Chapter 1


Introduction 


1.1 General motivation 


In the work on this thesis I have been connected to two larger projects: The De- 
terminants of Dialectal Variation at the University of Groningen to which I have 
been formally affiliated as a PhD student, and to SweDia—a collaboration between 
three Swedish universities with the more revealing subtitle Phonetics and phonology 
of the Swedish dialects in the year 2000. 

An aim for the research group in Groningen has been to develop dialectometric 
techniques that can be used to characterize linguistic variation in the aggregate. The
goal is to draw conclusions about what determines linguistic variation by developing
quantitative methods suited to exploring and analyzing large amounts of dialect
data. Techniques for visualizing results in high-quality maps have also been
developed by the researchers in Groningen. In this
thesis, I have wanted to contribute to this work by using both aggregate analysis for 
exploring dialectal variation in Swedish vowel pronunciation and detailed analysis 
of separate variables. The latter method corresponds to what has traditionally been 
done by dialectologists. By comparing the two different methodological approaches 
I have wanted to explore what kind of variation is accounted for in an aggregate 
analysis, and, in addition, show how the two methods can support each other and 
reveal different aspects of dialectal variation. I have also applied mapping techniques 
to specifically visualize pronunciation of vowels. 

For my research I have free access to data from the SweDia database. This dialect 
database is a joint effort by the phonetics departments of the universities in Lund, 
Stockholm and Umeå. The aim of the SweDia project was to document the dialectal
variation in rural varieties of Swedish around year 2000. The Swedish dialects have 
gone through massive leveling in the latter half of the 20th century. In this level- 
ing process especially morphological, syntactical and lexical variation has decreased 
profoundly. Phonetic and prosodic features are assumed to have been preserved to 
a larger degree. A large number of descriptions of phonetic and phonological
conditions in Swedish dialects as documented in the first decades of the 20th century
exist. But there are not many studies dealing with the phonetics and phonology of 
modern non-standard varieties of Swedish. The SweDia database was compiled in 
order to make this kind of research possible. 

In the SweDia database not only geographic variation is accounted for but also 
social. At all of the more than one hundred sites included in the database, recordings 
were made with both men and women and with both older and younger speakers. 
In dialectometry, the perspective has usually been purely geolinguistic. Relations 
between sites have been analyzed, without accounting for sociolinguistic variation 
within sites. The SweDia data makes it possible to include sociolinguistic dimensions 
in dialectometric work and to study the amount of variation also within sites. 

General aims of the SweDia project, which also apply to this thesis, were
to investigate how dialectal features are distributed in the Swedish language
area, how dialect use differs between age groups, and whether
there are dialect areas where leveling has not been as far-reaching as elsewhere.

In the research team in Groningen as well as in the SweDia project a general 
goal has been to study theoretical aspects of language systems. This has been 
done by investigating associations between linguistic levels and setting up linguistic 
typologies. I hope that my work can contribute to these theoretical considerations 
as well. 


1.2 Terminology 


The term “dialect” is used in this thesis to refer to the varieties that were recorded 
for the SweDia database. These are modern rural varieties of Swedish, which have a 
variable degree of dialectality on a scale between traditional rural dialect and regional 
standard language. The use of the term therefore does not in all cases agree with
how the term has traditionally been used by Swedish dialectologists, who usually have
reserved the term “dialect” for local varieties which have not been heavily affected 
by the large-scale convergence towards Standard Swedish of the last decades. 

In a number of analyses differences between men and women are studied. The 
term “sex” is used when describing properties related to strictly anatomical/physiological
differences between men and women, which are a consequence of women
having two X chromosomes and men having a Y chromosome. This is the case when
discussing differences in vowel spectra related to anatomical/physiological differences
in the vocal tract. Whenever discussing expressions of human culture and social 
interactions, of which language use is considered to be a part, the term “gender” is 
used. 


1.3 Overview of the thesis


In the next chapter the background for this thesis is given. The Swedish linguistic 
situation is described and an overview of variation in Swedish vowel pronunciation 
based on previous literature is given. Different approaches to acoustic analysis of 
vowels are discussed, and a short introduction to the dialectometric research tradi- 
tion in relation to traditional dialect geography is given. 

In Chapter 3 the aims and research questions for this study are specified, and in 
Chapter 4 the data set is described. A detailed description of the acoustic method 
used for assessing vowel quality is given in Chapter 5. 

A number of analyses of dialectal variation in Swedish vowel pronunciation are 
reported in Chapters 6 and 7. Detailed analyses of the different variables and of 
co-occurring vowel features are described in Chapter 6. In conjunction with the 
analysis of the variables, maps were created that display the variation in each vowel 
across sites and across age groups. These maps are found in Appendix C, which can 
be seen as a small atlas of Swedish vowel pronunciation. The results of a number of 
aggregate analyses are presented in Chapter 7. 

In Chapter 8 the results of all analyses are brought together and discussed, and 
the most important results are summarized in Chapter 9. 

Some of the maps in the thesis might seem small especially for readers who are 
interested in a specific region and would like to get a clearer view of that specific area. 
I found it more important to display related maps next to each other (for example 
maps of older and younger speakers) than to make full-page figures of every map. 
An electronic version of the thesis has been made available via the library of the 
University of Groningen (<http://dissertations.ub.rug.nl/>). A full-text PDF file 
can be downloaded which allows zooming in on the maps on the computer display. 


Chapter 2 
Background 


In this chapter the linguistic and theoretical background for the thesis is presented. 
In § 2.1 the status of dialects and regional varieties of Standard Swedish in Sweden 
and the Swedish-language parts of Finland is described. In § 2.2 different classifica- 
tions of the varieties of Swedish are presented; § 2.2.1 shows how the rural dialects 
have been classified, while § 2.2.2 shows the main regional varieties of Standard 
Swedish. Since the data for the present study comes from the SweDia database, 
classifications made based on some specific linguistic features within the SweDia 
project are described in § 2.2.3. 

In § 2.3.1 the Swedish vowel system is described. Variation in the pronunci- 
ation of the Standard Swedish vowel phonemes is described in § 2.3.2, while § 2.3.3 
briefly covers the main sources of variation in the very diverse vowel systems in the
traditional rural dialects. 

As a background for the choice of acoustic methods for this thesis, different 
methods for measuring vowel quality acoustically are discussed in § 2.4. Measurement
of formants is described in § 2.4.1, while different whole-spectrum approaches are 
discussed in § 2.4.2. The problem of speaker variability in acoustic measurements of 
vowels is explained (§ 2.4.3) and different solutions for normalizing for the speaker- 
dependent variation are discussed (§ 2.4.4). 

In § 2.5 different methods used in dialect geography and dialectometry are presen- 
ted as a basis for the choice of methods for analyzing dialectal variation in this thesis. 


2.1 Swedish dialects and standard language 


Swedish is a North Germanic language spoken as a first language by around nine 
million speakers. Out of these approximately 8.5-9 million live in Sweden (Börjars,
2006) and nearly 300,000 in Finland (Statistics Finland). Swedish is the main
language in Sweden and one of two official languages in Finland. The Swedish pop- 
ulation in Finland lives mainly along the coasts and comprises 5.4% of the Finnish 
population (Statistics Finland). 
The Swedish written language was standardized during the 18th century (Tele-
man, 2005). The spoken standard language developed in the capital and was influ- 
enced by the court and the speech of higher social classes in Stockholm. By the 20th 
century the spoken and written standard language was well established (Thelander, 
2009). The regional variation in Standard Swedish today is generally on the pros- 
odic level, primarily intonation, but there is also variation on the sub-phonemic 
level, especially in vowel pronunciation. Local dialects show variation from Stand- 
ard Swedish on all linguistic levels and are not always intelligible for speakers of 
Standard Swedish. 


2.1.1 Standard Swedish 


The Swedish standard language is well-codified with a number of relatively recent 
dictionaries and handbooks¹ (Thelander, 2009). The Standard Swedish written lan-
guage is uniform and there is a norm which is accepted in the whole language area. 
For the spoken language, however, a neutral standard variety, which would not be 
geographically identifiable, hardly exists (Garlén, 2003, 7-8). Rather, there are a 
number of regional varieties of Standard Swedish, which differ from each other when 
it comes to prosodic features and the pronunciation of certain phonemes. In the 
pronunciation dictionary Svenska språknämndens uttalsordbok a rather broad definition
of Standard Swedish pronunciation is given: Standard Swedish pronunciation
is defined as a pronunciation which can be generally accepted and used in the whole
language area (“en uttalsform som kan accepteras och brukas allmänt över hela det
svenska språkområdet”) (Garlén, 2003, 7). The dictionary is not strictly normative,
but the aim is to give recommendations which can be applied to most of the vari- 
eties of Standard Swedish. For many words several pronunciations are given in the 
dictionary. The aim is not to create a uniform spoken language, but to recommend 
forms which lead to good intelligibility in all parts of the language area (Garlén, 
2003, 8). 

In comparison to the relatively lax attitudes towards variation in Standard Swedish 
nowadays, exemplified by the definition of Standard Swedish in the pronunciation 
dictionary, the attitudes were much more rigid up until the 1970s. For example, 
there was a demand for news anchors in television and radio to speak neutral Stand- 
ard Swedish, that is, a standard variety which did not signal geographic provenance 
(Thelander, 2009). Because Standard Swedish had developed in the capital, the 
neutral Standard Swedish was affiliated with the Central Swedish speech tradition 
around Lake Mälaren (close to Stockholm and Uppsala). This Central Swedish pro-
nunciation is the one that is still used in most schematic descriptions of Swedish. 


¹For example, Svenska Akademiens ordbok (SAOB, 1893-, historical dictionary), Svensk ordbok
utgiven av Svenska Akademien (SO, 2009, thesaurus), Svenska Akademiens ordlista (SAOL,
2006, spelling dictionary and word-form dictionary), Svenska språknämndens uttalsordbok (Garlén,
2003, pronunciation dictionary), Svenskt språkbruk (2003, dictionary of idioms and collocations),
Svenska Akademiens grammatik (Teleman, Hellberg, & Andersson, 1999, descriptive grammar),
Språkriktighetsboken (2005, handbook of frequently-asked language questions), Svenska skrivregler
(2008, writing rules), Handbok i svenska som andraspråk (2008, handbook of Swedish as a
second language).

Since the 1970s the attitudes have become more tolerant and nowadays it is not
uncommon to hear news anchors with marked local features in their pronunciation 
or a foreign accent (Thelander, 2009). According to Svensson (2005), one reason for 
the increased linguistic variation in the public domain is the increase in the length 
of compulsory education. Until the 1960s, most Swedes attended only six
years of elementary school. Today, almost everyone completes at least eleven or twelve
years of schooling, including non-compulsory upper secondary school. With the 
higher education level, a growing number of people participate in public debate and 
thereby the linguistic variation in public language has increased and the standard 
norm has weakened. Thelander (2009) argues that discussions about correct norms 
have become superfluous, because a majority of the Swedes feel secure in their use 
of the (regional) standard language. 

The Swedish spoken in Finland follows the same standard norms as Swedish 
in Sweden (that is, Central Swedish). However, the language-contact situation 
with Finnish continuously influences the Swedish spoken and written in Finland. 
Moreover, the fact that a language is used in two different countries with differ- 
ent societies and social systems naturally leads to some differences. According to 
Thelander (2009), not only the spoken language, but also the written language of 
Swedish-speaking Finns can be recognized relatively easily by Swedes. In addition
to Finnish, the regional standard variety of Swedish spoken in Finland has also been 
influenced by the Finland-Swedish dialects and some other languages (for example, 
Russian, German). Finland-Swedish also employs some features that are considered 
archaic in Sweden. Swedish-speaking Finns who live in areas dominated by Finnish 
are usually bilingual, and under the right circumstances individuals develop a bal- 
anced bilingualism with high proficiency in both Swedish and Finnish (Tandefelt, 
1996). However, the extent to which Swedish-speaking Finns have the opportunity 
to use Swedish in their daily lives has been shown to determine the development of 
high proficiency in Swedish and idiomatic use of the Swedish language (Leinonen & 
Tandefelt, 2000, 2007). 


2.1.2 From dialect diversity to leveled dialects 


Most European languages have recently gone through processes of dialect leveling. 
Auer (2005) has shown that, in spite of superficial heterogeneity in the dialect-
standard constellations found in Europe, the chronological development from local
base dialects to a spoken standard variety with only little variation can be described
systematically with a few types. Industrialization has played an important role in
this leveling of the base dialects. 

Swedish has shown extensive geographical variation from medieval times until the 
first half of the 20th century (Hallberg, 2005). More or less every parish had a dialect 
of its own, distinct from the neighboring dialects. These rural dialects were characterized
by dialectal features at all linguistic levels: segmental phonology, prosody,
morphology, lexicon, semantics, syntax. The 20th century changed this linguistic
situation dramatically. The main causes were industrialization, urbanization and
migration, all naturally connected to each other. 

Industrialization started in Sweden around 1870. Around 1900 more than half of 
the working population of Sweden was still employed in agriculture, but at the be- 
ginning of the 21st century agriculture employed less than three percent (Thelander, 
2005, 1905). Industrialization resulted in rapidly growing industrial communities
and led to a “flight from the countryside”. In 1850, 90% of the Swedish population
still lived in the countryside; in 1980, only 17% (Hallberg, 2005, 1691). Industry
grew until the 1960s. After that, it, too, lost importance and today two thirds of the 
population are employed in the service sector (Thelander, 2005, 1905). In this soci- 
etal shift, the rural life style, which had been dominant in earlier days, was almost 
completely replaced by an urban life style (Nordberg, 2005, 1759). 

The linguistic result of the societal shift was large-scale homogenization. In 
the cities and industrial communities, the dialects of immigrants mixed and became 
simplified. Examples of such simplification are the replacement of the grammatical
three-gender system by a two-gender system and the loss of vowel phonemes
when diphthongs merge with long vowels. In communities with local dialects very
divergent from the standard language, the immigrants did not always learn the local 
variety but spoke Standard Swedish instead, and, hence, diffused features from the 
standard language into the local dialect (Nordberg, 2005, 1769). When keeping 
the contact with their original social networks, migrants also, to some extent, con- 
tributed to the diffusion of standard variants in the locality they had moved from 
(Nordberg, 2005, 1769). 

Apart from migrations, the school system also added to the dialect leveling. 
Especially in areas with divergent rural dialects “the local variety was counteracted 
at school up until the 1970s at least, even if it was seldom forbidden” (Nordberg, 
2005, 1767). 

One of the most important changes in the linguistic situation in Sweden during 
the past century is that while earlier many people grew up in a code-switching 
situation between dialect and standard language, today code-mixing best describes 
the language situation for the majority of Swedish speakers (Andersson, 2007,
55). Today, the linguistic distance between local varieties and the standard language
is generally so small that the two varieties cannot be seen as separate linguistic
systems. Instead, speakers make use of a sliding scale where the share of dialectal
features and standard variants varies according to speech situation, speech partner
and the degree of formality.

Swedish spoken language can be categorized as belonging to one of the four 
following levels (Thelander, 1994; Hallberg, 2005): 


• rural local dialect
• regional, leveled dialect
• regional standard language
• neutral standard language

As mentioned in the previous section a neutral Standard Swedish, which does not
signal geographic provenance, hardly exists. The varieties regional, leveled dialect 
and regional standard language were results of the language homogenization of the 
past century. In this process, the most local, divergent dialectal features were lost, 
but features representative for a larger region persisted. This regionalization of 
local dialects was shown clearly in a study by Thelander (1979) of the dialect of 
Burträsk in Västerbotten. The larger the geographic spread of a dialectal feature in
the surroundings of Burträsk, the more prone the Burträsk subjects were to use the
dialect variant instead of the standard variant. 

Auer (2005) calls the stage with intermediate varieties between standard language 
and local dialects a diaglossic repertoire (in contrast to the diglossic repertoire where 
speakers code-switch between their local dialect and a spoken standard language). 
Both a diaglossic and a diglossic situation can lead to the loss of local dialects. 

Today, at the beginning of the 21st century, the rural local dialects are disappearing
in Sweden, and most speakers are found somewhere on the scale between
regional dialect and regional standard language. Only in some peripheral areas (especially
Upper Dalarna, Norrbotten and Gotland) are local dialects still spoken. In
these areas, the local dialect and Standard Swedish are perceived as two separate 
linguistic systems and the speakers are bidialectal. In many of these places, however, 
speakers of the local dialect are found mainly among older people. 

It may seem contradictory that rural dialects are disappearing in times when 
the attitudes towards linguistic variation are relatively liberal (compare § 2.1.1). 
However, according to Auer this is not uncommon: “In the final stage before loss, the 
attitudes towards the now almost extinct base dialect are usually positive again, and 
folkloristic attempts at rescuing the dialect may set in — usually without success” 
(Auer, 2005, 29). In a study in Överkalix (Norrbotten) in 1988, Källskog (1990)
found overall positive attitudes towards the local dialect among junior high school 
students. 78% of the dialect-speaking students and 66% of the students who did not 
speak the local dialect but only Standard Swedish had positive attitudes towards 
the local dialect. The local dialect was affiliated with belonging to the home district, 
and being able to talk to one’s grandparents was mentioned as something positive 
about the dialect. However, only 36% of all junior high school students (grades 
7-9) in Överkalix were actually speakers of the local dialect. Of their parents, 70%
spoke the dialect, which shows a remarkable decline. It turned out that many of 
the parents had experienced negative attitudes towards the dialect in their youth 
or had had to abandon the dialect when entering higher education. Many parents 
had chosen to speak Standard Swedish to their children in order to avoid problems 
at school or to make it easier for the children to get an education and job outside 
Överkalix later on. The attitudes of the previous generation, hence, were decisive
for the declined use of dialect in the younger generation. 

In a study of attitudes towards a number of Swedish dialects by Bolfek Radovani 
(2000), 42% of the subjects answered that dialect can be used in all circumstances, 
which seems very liberal. However, the study also showed that the subjects in- 
terpreted the word “dialect” differently than linguists would. The varieties that the 
subjects considered dialects would be categorized as leveled dialect or regional stand-
ard language by Swedish linguists. This fits in with Auer’s typology according to 
which the repertoire is restructured when the rural dialects are lost, and the regi- 
olectal forms are now considered the “most basilectal way of speaking” (Auer, 2005, 
27). 

Even in the far-reaching processes of linguistic homogenization, differences between 
rural and urban communities do still exist. In the cities and towns, there is more 
social and linguistic stratification than in rural areas. A large number of studies of 
Swedish urban and rural communities have shown that higher social groups gener- 
ally use more standard variants, while local variants are preferred by lower social 
groups.² In rural environments, less socio-economically defined linguistic variation
is found, because the population is more homogeneous. In rural settings, the net- 
works are generally also smaller and more close-knit, which influences the linguistic 
behavior (Nordberg, 2005, 1761). 

The Swedish dialects in Finland have had a stronger position throughout the 20th 
century than the dialects in Sweden (Reuter, 2005, 1655). One of the reasons for this 
is that industrialization reached Finland somewhat later than Sweden, starting in the 
1880s, and was slower in the initial phase. Until the 1960s the majority of the Finnish 
population still lived in a rural environment (Tandefelt, 1994). Another reason 
is that elementary school was not introduced in Finland until the 1920s (Reuter, 
2005, 1655). In the Swedish language area in Finland, especially in the province of
Österbotten, bidialectalism is still common, and the majority of the speakers have a
local dialect as their first language. This holds not only for the countryside but also 
for smaller towns (Ivars, 1996). 

However, in spite of positive attitudes towards local dialects, regionalization has 
also affected the Swedish dialects in Finland to some extent. The regionalization 
tendencies are stronger in the southern parts of the Finland-Swedish area (above all 
close to Helsinki) than in Österbotten (Ivars, 2003; Sandström, 1996). In the region
close to the capital in Finland, rural dialects have disappeared not only because of 
change towards Standard Swedish, but also because of language shift to the majority 
language Finnish (Tandefelt, 1988, 1994, 1996). 


2.2 Swedish dialect geography 


Since the 1930s numerous dialect geographic works, including maps, have been pub- 
lished describing the dialects in the Swedish language area (for an overview see 
Edlund, forthcoming). However, no comprehensive dialect atlas covering the whole 
language area has yet been compiled. The existing atlas works include only smal- 
ler parts of the language area, while dialect geographic works including the whole 
Swedish (or Nordic) dialect area are generally monographs dealing with some specific
features or words. In the Swedish language area, there is a stronger tradition
for compiling dialect dictionaries than dialect atlases. A number of dictionaries and
word lists covering smaller or larger dialect areas of the Swedish language area exist.
Of the dialect dictionary covering the dialects of Sweden, Ordbok över Sveriges
Dialekter (Reinhammar & Nyström, 1991-2000), unfortunately only three booklets
have been published, covering words in the range A-back. At the moment, no more
booklets of the work are being published, but the archive put together for compiling
the dictionary is open for researchers. The dictionary covering the Swedish dialects
in Finland, Ordbok över Finlands svenska folkmål (Ahlbäck & Slotte, 1976-2007),
has reached the word och and is being compiled at the Research Institute for the
Languages of Finland.

²For example, Thelander (1979), Nordberg (1985), Hammermo (1989), Källskog (1990),
Aniansson (1996), Kotsinas (1994), Sundgren (2002).

Lexical geography in the Wörter-und-Sachen tradition has been especially strong
in the Swedish language area resulting in several monographs concerning a particular 
word or semantic field. These lexical studies have provided insight into phonological 
history and change, etymology and semantic development (Edlund, forthcoming). 
Numerous dialect geographic studies concerning phonetics and phonology also exist. 
These have often dealt with sound changes from a historical point of view. Mappings 
of morphology, syntax and prosody are less frequent, but a smaller number of studies 
dealing with these linguistic levels exist. 

Examples of more extensive dialect geographic works concerning specific regions 
of the Swedish language area are the work in five volumes by Götlind & Landtmanson
(1940-50) dealing with the dialects of Västergötland, Südschwedischer Sprachatlas
by Benson (1965-70) and a dialect atlas of the northern part of Norrland by Hansson 
(1995). 

Standard works describing the Swedish dialects are Våra folkmål by Wessén
(1969), which was first published in 1935, and Svenska dialekter by Pamp (1978). 
Pamp describes the dialects in each of the historical provinces of Sweden without 
suggesting any linguistic classification of the dialects, beyond the administrative 
province borders. Figure 2.1 displays the Swedish provinces. The provinces of 
Sweden are grouped as belonging to one of the three larger regions Götaland, Svealand
and Norrland. The province names are used for reference when discussing the
results in the following chapters of this thesis.

Pamp (1978) does not deal with the Swedish dialects in Finland. Descriptions of 
the Finland-Swedish dialects are found in Svenskan i Finland by Ahlbäck (1956) and
Från Pyttis till Nedervetil by Harling-Kranck (1998). Within the Finland-Swedish
area a division according to provinces is usually applied. The provinces with Swedish 
population in Finland are displayed in Figure 2.1. 

Wessén (1969) suggested a linguistic classification of the Swedish dialects into six 
groups. This classification is described more closely in § 2.2.1. Elert (1994) proposed 
a division of the regional varieties of Standard Swedish that largely resembles the 
classification of the traditional rural dialects by Wessén. Elert’s classification of the 
regional varieties of Standard Swedish is described in § 2.2.2. 

Figure 2.1. The historical provinces of Sweden and Swedish-speaking parts of
Finland. Sweden is divided into the three larger regions Götaland, Svealand and
Norrland.

Recently, data collected in the SweDia project (see § 4.1) have made it possible
to conduct quantitative analyses of modern spoken Swedish. The data were gathered
around year 2000, and within the project, Swedish dialects have been classified
according to some specific linguistic features. Since the data for this thesis comprise
vowel data from the SweDia database, an overview of classifications made based on 
data of other linguistic levels from the same database is given in § 2.2.3. Compar- 
ing the results of different studies where SweDia data have been used makes sense 
because all the studies involve the same participants recorded at the same point 
in time. The studies of variation at different linguistic levels are therefore directly 
comparable. 

In Swedish dialectology, computational methods have not been very commonly 
used (Edlund, forthcoming). In some of the studies described in § 2.2.3, methods 
borrowed from the dialectometric research tradition (see § 2.5) have been applied 
(particularly cluster analysis), and Leinonen (2007) used cluster analysis and multi- 
dimensional scaling for analyzing vowel pronunciation in Finland-Swedish dialects. 
However, Swedish dialect geographic works where quantitative methods are used 
have generally focused on some specific part of the language system, rather than 
aiming at an aggregate analysis, which is what has been the main focus of dialecto- 
metry. Aggregate dialectometric analyses of the whole Swedish language area do not 
exist so far. In the dialect atlas of the northern part of Norrland (Hansson, 1995) 
some summarizing dialectometric maps are included. 


2.2.1 Classification of Swedish rural dialects 


According to Wessén (1969, 12-13), the rural Swedish dialects have formed a con- 
tinuum without any sharp dialect borders. Neither has this continuum been disrup- 
ted by national borders in Scandinavia; Danish and Norwegian dialects belong to 
the same continuum. Even though Wessén recognized that no abrupt borders existed
between dialect areas, he justified a classification of the dialects on practical
grounds: a sketch of a dialect division will help to give an overview of the varying
linguistic phenomena. Wessén (1969) described the Swedish dialects as belonging to 
six main dialect areas: 


• South Swedish dialects (sydsvenska mål)
• Götaland dialects (götamål)
• Svealand dialects (sveamål)
• Norrland dialects (norrländska mål)
• Gotland dialects (gotländska mål)
• Finland-Swedish dialects (östsvenska mål)


Figure 2.2 (left) shows the approximate areas. The Svealand dialects are divided into
three sub-groups: East Central Swedish (Sw. uppsvenska), Middle Central Swedish
(Sw. mellansvenska) and the dialects of Dalarna (Sw. dalmål). The classification is
based mainly on phonetic, phonological and morphological features, viewed from a
historical perspective. The division has been commonly used by Swedish dialectologists.

Figure 2.2. The map to the left shows the classification of Swedish rural dialects
according to Wessén (1969), while the map to the right shows the division of modern
spoken Swedish proposed by Elert (1994). The divisions are rather similar.


The South Swedish area includes dialects in the provinces Skåne, Blekinge and
southern parts of Halland and Småland.

The province Västergötland is the center of the Götaland dialects. Other provinces
that Wessén includes in the Götaland area are Dalsland, northern Småland, northern
Halland and the south-west of Östergötland. Värmland is also included in the
Götaland area, even though it has a special status when it comes to many features.
Bohuslän is a transitional area between South Swedish, Götaland and Norwegian
dialects. Some Götaland features have spread via Värmland and western parts of
Västmanland to the north.

The center of the Svealand dialects is in Uppland. Uppland together with
Gästrikland, south Hälsingland, south-east Dalarna, eastern parts of Västmanland
and northern and eastern parts of Södermanland form the East Central Swedish area.
Närke and the rest of Södermanland form a transitional area between Svealand and
Götaland dialects, and the same is true for Östergötland, north-east Småland and
Öland. These dialects are called Middle Central Swedish. The dialects in Dalarna
are very conservative and divergent and comprise a separate group within the
Svealand dialects.

The Norrland dialects are spoken along the Swedish coast from north Hälsingland
to Norrbotten in the north. Additionally, the dialects in Jämtland can be included
in the Norrland area, even if they share a number of features with Norwegian
dialects. The dialects in Härjedalen and north-west Dalarna are of a Norwegian type,
according to Wessén.

The Swedish dialects in Finland have comprised an East Swedish dialect area 
together with the Swedish dialects in Estonia (spoken along the western coast of 
Estonia). However, most of the Swedish population of Estonia fled to Sweden during 
the Second World War and Estonian-Swedish dialects are almost extinct today. The 
Finland-Swedish dialects share many features with the East Central Swedish ones. 

The dialects on the island Gotland have preserved many conservative features 
and are very different from mainland-Swedish ones. 


2.2.2 Regional varieties of Standard Swedish 


The division of the regional varieties of Standard Swedish proposed by Elert (1994) 
is displayed in Figure 2.2 (right). A comparison with the division of rural Swedish
dialects by Wessén (1969) in the same figure shows many similarities. Elert (1994)
proposed three main varieties of Standard Swedish: South Swedish, Central Swedish 
and Finland-Swedish. Central Swedish was further subdivided resulting in a division 
into seven groups: 


• South Swedish (sydsvenskt talspråk)
• East Central Swedish (östmellansvenskt talspråk)
• West Central Swedish (götiskt-västmellansvenskt talspråk)
• the spoken language of Bergslagen (bergslagstalspråk)
• the spoken standard language of Norrland (norrlandsstandardsvenskt talspråk)
• the spoken language of Gotland (gotländskt talspråk)
• Finland-Swedish (finlandssvenskt talspråk)


Elert (1994) based his division mainly on sentence intonation and differences in vowel 
pronunciation. In addition, some varieties are characterized by salient features like 
the use of dorsal /r/ in South Swedish or the lack of the word accent distinction in 
Finland-Swedish. The regional variation in vowel pronunciation in Standard Swedish 
is discussed in more depth in § 2.3.2. 


2.2.3 Typologies based on specific features 


Based on data from the SweDia database (see § 4.1), Bruce (2004) classified Swedish 
dialects according to intonational variation. The intonational parameters of the 
model were focal accentuation, phrasing, word accentuation and compounding. Seven
distinct dialect regions were identified, largely corresponding to the ones found by
Elert (1994) (§ 2.2.2). Bruce (2004) called the seven intonational types South, West, 
Central, East, Far East S, Far East N and North. Far East stands for Finland- 
Swedish dialects, which were divided into two subtypes: Southern (the south coast) 
and Northern (the west coast). 

Schaeffler (2005) also used data from the SweDia database. He used cluster ana- 
lysis for classifying the Swedish dialects based on phonetic variation in quantity. 
Schaeffler found a division into three main types: Southern Swedish (up to Uppland
and Middle Dalarna), Northern Swedish (Norrland and Åland) and Finland-Swedish
(mainland Finland-Swedish dialects). The three areas are separated mainly
by consonant length. The Finland-Swedish dialects are characterized by a shorter 
consonant than the two other areas in V:C sequences, while the Southern Swedish 
area shows a markedly short consonant in VC: sequences. In the Northern area 
more preaspiration was found than in the two other areas. The phonetic differences 
between the three types could be connected to the phonological systems of the dia- 
lects, that is, the presence or absence of VC and V:C: syllables in stressed positions 
(see § 2.3.3). 

Lundberg (2005) used clustering methods for analyzing differences in the pronun- 
ciation of the vowel in the word lat in Swedish dialects analyzed with Mel-frequency 
cepstral coefficients (MFCCs, see § 2.4.2). The data of the study comprised older 
male speakers from the SweDia database. The geographic variation in the Swedish 
vowel a elicited with the word lat was studied and three clusters representing differ- 
ent variants of the vowel were found. The study showed that clustering should not 
be applied without evaluation. Principal component analysis was used to establish 
which MFCCs were important for identifying the clusters. 
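
The general workflow described above (cluster the vowel tokens, then use principal component analysis to see which coefficients drive the grouping) can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure or data of Lundberg (2005): the 12-dimensional vectors are synthetic stand-ins for MFCCs, plain k-means is swapped in as the clustering method, and all function names are hypothetical.

```python
import numpy as np

def kmeans(data, k, iterations=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [data[0]]
    for _ in range(1, k):
        # Distance of every sample to its nearest already chosen centroid.
        dists = np.min([np.linalg.norm(data - c, axis=1) for c in centroids], axis=0)
        centroids.append(data[np.argmax(dists)])
    centroids = np.array(centroids)
    for _ in range(iterations):
        # Assign every sample to its nearest centroid.
        distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # Move each centroid to the mean of its members.
        centroids = np.array([data[labels == j].mean(axis=0) for j in range(k)])
    return labels

def pca_loadings(data):
    """Principal axes (rows) expressed in the original coefficients (columns)."""
    centered = data - data.mean(axis=0)
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    return components

# Synthetic data (assumption): three vowel variants that differ only in the
# first two of twelve coefficients, 20 tokens each, plus small random noise.
rng = np.random.default_rng(0)
centers = np.zeros((3, 12))
centers[0, 0], centers[1, 0], centers[2, 1] = -4.0, 4.0, 5.0
true_groups = np.repeat(np.arange(3), 20)
features = centers[true_groups] + rng.normal(scale=0.3, size=(60, 12))

labels = kmeans(features, k=3)
loadings = pca_loadings(features)

# Coefficients with the largest absolute loading on the first two components.
dominant_pc1 = int(np.argmax(np.abs(loadings[0])))
dominant_pc2 = int(np.argmax(np.abs(loadings[1])))
print(dominant_pc1, dominant_pc2)
```

With real data no ground truth such as `true_groups` is available, which is one reason why a clustering result needs separate evaluation before it is interpreted.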


2.3 Swedish vowels


The Standard Swedish vowel system is described below in § 2.3.1. As mentioned 
above there is no neutral, geographically and socially non-identifiable, Standard 
Swedish, but a number of regional varieties of Standard Swedish exist. In § 2.3.2 
the regional differences in vowel pronunciation are described. 

The vowel systems of Swedish rural dialects differ from Standard Swedish both 
phonetically and phonologically. In § 2.3.3 an overview of some general features 
concerning a number of Swedish dialects is given. 


2.3.1 The Standard Swedish vowel system 


Standard Swedish has eighteen vowel phonemes, nine long and nine short ones
corresponding to each other pairwise. Table 2.1 displays these vowels and their Swedish
orthographic equivalents. The correspondence between the long and short vowels is
not only an orthographical one, but the correspondence exists in the linguistic
competence of native speakers and is founded on phonetic similarity as well as
morphophonological alternations (Linell, 1973, 8).

Table 2.1. The Standard Swedish vowel phonemes (Hedelin, 1997).

Swedish letter   Long vowel   Short vowel
a                /ɑː/         /a/
e                /eː/         /e/
i                /iː/         /ɪ/
o                /uː/         /ʊ/
u                /ʉː/         /ɵ/
y                /yː/         /ʏ/
å                /oː/         /ɔ/
ä                /ɛː/         /ɛ/
ö                /øː/         /ø/


The phonemes are displayed with IPA symbols used for Standard Swedish in the
pronunciation dictionary Norstedts svenska uttalslexikon (Hedelin, 1997). The
pronunciation dictionary Svenska språknämndens uttalsordbok (Garlén, 2003) appeared
a few years later, but does not use IPA symbols for all vowels, which is the reason
why symbols in Norstedts svenska uttalslexikon were chosen to denote the Standard
Swedish vowel phonemes and Standard Swedish pronunciation throughout this
thesis.

A few differences exist between the two mentioned Swedish pronunciation
dictionaries. The long a, transcribed [ɑː] by Hedelin (1997), is transcribed [ɒː] in the
more recent dictionary. Garlén (2003, 31) describes the Swedish long a vowel as
a slightly rounded open back vowel. Hence, the actual pronunciation is something
between unrounded [ɑː] and rounded [ɒː].

The other vowel for which the two dictionaries have used different IPA symbols
is the short ö. Hedelin (1997) uses [ø], while Garlén (2003) uses [œ]. According
to Garlén (2003) the pronunciation is, thus, more open than according to Hedelin
(1997). However, according to both authors the pronunciation of short ö is more
open than the pronunciation of long ö, which is probably more important than the
exact degree of openness of the vowel.

According to Elert (1997) in the Introduction to Norstedts svenska uttalslexikon,
the Swedish long o is somewhat more open than the cardinal vowel [u]. Therefore
the phonetic symbol [ɷː] is used in Norstedts svenska uttalslexikon. The symbol [ɷ]
denotes a semi-high back rounded vowel, but was marked as obsolete by IPA in
1989 and is not used in the newest version of the International Phonetic Alphabet.
Throughout this thesis [uː] is therefore used for the Standard Swedish long o.

The pronunciation of long u in Standard Swedish is actually more fronted than
the symbol [ʉː] suggests. According to Elert (1997) a more precise phonetic symbol
would be [u:]. 

In addition to the eighteen phonemes, more open allophones of ä and ö are used
when these vowels are followed by /r/ (which is equal to [r] or a retroflex consonant
resulting from the consonant combinations rd, rl, rn, rs and rt):

• /ɛː/ → [æː]
• /ɛ/ → [æ]
• /øː/ → [œː]
• /ø/ → [œ]
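
The allophone rule above can be written out as a small rewrite function. The following is only a sketch: the representation of a word as a list of IPA segments, the set of retroflex symbols [ɖ ɭ ɳ ʂ ʈ] used for the combinations rd, rl, rn, rs and rt, and the function name are assumptions made for the illustration.

```python
# More open allophones of ä and ö used before /r/ (long and short variants).
OPEN_ALLOPHONES = {"ɛː": "æː", "ɛ": "æ", "øː": "œː", "ø": "œ"}

# Contexts that count as /r/: [r] itself or a retroflex consonant
# (assumed symbols for the results of rd, rl, rn, rs and rt).
R_CONTEXT = {"r", "ɖ", "ɭ", "ɳ", "ʂ", "ʈ"}

def apply_pre_r_opening(segments):
    """Replace /ɛ(ː)/ and /ø(ː)/ by their open allophones before /r/."""
    result = list(segments)
    for i in range(len(result) - 1):
        if result[i] in OPEN_ALLOPHONES and result[i + 1] in R_CONTEXT:
            result[i] = OPEN_ALLOPHONES[result[i]]
    return result

print(apply_pre_r_opening(["l", "ɛː", "r", "a"]))  # vowel followed by /r/
print(apply_pre_r_opening(["l", "ɛː", "k", "a"]))  # vowel not followed by /r/
```

Only a vowel immediately followed by one of the /r/ contexts is changed; in all other positions the phoneme keeps its basic quality.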


In an acoustic analysis of Swedish long vowels, Eklund & Traunmüller (1997) showed
that only /ɑː/ and [æː] are plain monophthongs.³ Especially the mid vowels, /oː/,
/øː/ and /eː/, showed substantial diphthongization. These vowels are pronounced
as opening diphthongs; F1 increases in the course of the vowels, and F2 decreases
in the front vowels, /øː/ and /eː/, and increases in the back vowel /oː/. The close
vowels (/iː/, /yː/, /ʉː/ and /uː/) showed smaller formant movements than the mid
vowels. The close vowels first become even more close and only at the very end more
centralized. The subjects (six male and six female in the age range 20-58) were from
the greater Stockholm area.

Describing the Swedish vowels with distinctive features has turned out to be com- 
plicated. A simple matrix, which in addition to the length distinction includes three 
degrees of openness, a front—back distinction, and a roundness distinction for front 
vowels, can be set up (Table 2.2). The problem is that this simple matrix does not 
correspond to “an equally simple structure of articulatory or acoustic facts” (Fant, 
1971, 259). Linell (1973) summarizes the problems involved in a phonological de- 
scription of Swedish vowels and reviews the most common solutions suggested. The 
biggest problem is the phoneme /ʉː/ and its short counterpart /ɵ/. In articulatory
terms /ʉː/ is a close or near-close and extremely rounded front vowel, while /ɵ/ is
a mid vowel. The articulatory and perceptual distance between the long and short
variant is hence large. In the different suggested phonological interpretations, the
position of /ʉː/ and /ɵ/ has varied in both the height dimension and on the front–back
scale. Linell (1973) proposes that /ʉː/ should be treated as a central vowel
phonologically. Other researchers treat /ʉː/ as a front vowel and use three degrees
of roundness (unrounded, out-rounded, in-rounded) for distinguishing /ʉː/ from /iː/
and /yː/ (Traunmüller & Öhrström, 2007) or from /eː/ and /øː/ (Malmberg, 1956).

A problem with the simple solution in Table 2.2 is that it pushes ö into a more
open position than articulatory and acoustic data suggest, and that u is not grouped
with the other close vowels, with which it shares important features (see, for example,
§ 2.3.2.4). Other problems with phonological descriptions of the Swedish vowels have
been how to fit in the pre-/r/ variants of ä and ö, and that the long a is a back
vowel but its short counterpart a central vowel. Linell (1973, 12) points out that,
interestingly enough, the vowels that present a problem for the phonological
description of the Standard Swedish vowels are the same that seem to show considerable
variation across Swedish dialects.


³ /ɛː/ was not included, because the subjects were asked to pronounce the Swedish letter names.
The name of the letter ä is pronounced [æː], while all other letters are pronounced as the
corresponding long vowels in Table 2.1.

Table 2.2. Distinctive features of Swedish vowels. This symmetrical matrix does
not correspond to an equally simple structure in articulatory and phonetic terms.

        unrounded      rounded        back
        front vowels   front vowels   vowels
close   i(:)           y(:)           o(:)
mid     e(:)           u(:)           å(:)
open    ä(:)           ö(:)           a(:)


2.3.2 Regional variation in vowel pronunciation 


The IPA symbols in Table 2.1 are based on Central Standard Swedish. Regional
varieties of Standard Swedish differ considerably from Central Standard Swedish
in the pronunciation of the vowel phonemes. This concerns especially the long
vowels. In the following, the variation in the most variable vowels is
described following Elert (2000).

Some regional features affect a number of vowel phonemes in a similar way, while 
others concern only individual vowels. In §§ 2.3.2.1-4 below features are described 
that are characteristic for specific geographic regions and affect several vowels in 
these regional varieties in a similar way. In §§ 2.3.2.5-9, on the other hand, geo- 
graphic variation in individual vowel phonemes is described. 


2.3.2.1 South Swedish diphthongization 


In South Swedish spoken language the long vowels are pronounced as rising⁴ diphthongs
(Elert, 2000, 38-40). The long front vowels are closing diphthongs that start
with a more open pronunciation and end approximately with the Standard Swedish 
vowel quality. The long back vowels start as unrounded central vowels and move 
backwards to the Standard Swedish vowel quality. The close vowels are diphthong- 
ized more strongly than the open vowels. For example, /i:/ > [°i(j)], /u:/ > [u(u)], 
/@:/ > [®o:] (Elert, 2000, 38). 

The amount of diphthongization varies within the South Swedish area, and also
depends on how close to the standard the speaker's variety is. The diphthongization
is strongest in the (north-eastern part of the) province of Skåne, but is found in the
whole South Swedish area.


2.3.2.2 East Central Swedish diphthongization 


In Central Sweden, mainly in western and southern parts of Södermanland and south
Västmanland, the long vowel phonemes are often pronounced as falling⁴, centering
diphthongs (Elert, 2000, 40-42). The vowel quality at the beginning of the vowels
corresponds to the Standard Swedish pronunciation, followed by an [e]- or [o]-like
quality. Sometimes laryngalization occurs between these two elements. Examples:
/ix/ > [ix(j)?], /u:/ > [u:(B)°], /9:/ > [9°] (Elert, 2000, 40).


⁴ The terms “falling” and “rising” are used here to denote which part of the diphthong is
prosodically the most prominent one. A rising diphthong ends with the more prominent part (e.g.,
[“g:]), while a falling diphthong starts with the more prominent part (e.g., [9:°]).

This diphthongization has been shown to be prosodically conditioned (Bleckert, 
1987). It is strongest in stressed long vowels and at the end of sentences. It is a 
rather new feature that was first attested in some Central Swedish towns at the 
beginning of the 20th century. Strong diphthongization is not commonly accepted 
as Standard Swedish (Elert, 2000, 42). 


2.3.2.3 Gotlandic diphthongization 


The dialects on Gotland are characterized by both archaic diphthongs and secondary 
diphthongs (see § 2.3.3). Some of these diphthongs are only part of the rural dialects, 
but also in standard-like Swedish spoken on Gotland some of the long vowels are 
diphthongized (Elert, 2000, 42-43). The diphthongization concerns mainly the mid 
vowels, but the pattern is not as clear as in South Swedish and East Central Swedish 
diphthongization. Some of the diphthongs are falling, while others are rising; some 
are closing, while others are centering: /e:/ > [er], /e:/ > [Pex], /a:/ > [Fa], /o:/ > 
[or], /u:/ > [Pur], /or/ > [or®] (/i:/, /y:/ and /a:/ are monophthongs) (Elert, 2000, 
42). 


2.3.2.4 Semi-vowel/fricative ending in long close vowels 


The long close vowels are followed by a consonantal segment in large parts of Sweden: 
/i:/ > [P], /y:/ > [y"], /a:/ > [28] and /u:/ > [u®] (Elert, 2000, 43-44). The 
consonantal ending is most prominent in word final position or before another vowel. 
This feature is common in the spoken language in the Central Swedish area. In 
South Swedish it is less prominent and in Norrland and Finland it hardly exists. 


2.3.2.5 Damped i and y


The Swedish “damped” i and y are common in the East Central Swedish area (Elert, 
2000, 44-45). These variants of the Swedish /i:/ and /y:/ are pronounced with a 
markedly low F2, so that they sound more retracted than in Standard Swedish, 
closer to [ɨ] and [ʉ] (Björsten & Engstrand, 1999). They are also co-articulated
with a fricative consonant, which gives a characteristic buzzing sound, and, hence, 
are often transcribed as [i’] and [y”] (which is not entirely appropriate because the 
buzzing is present in a larger part of the vowel segment and not only at the end). 

The damped pronunciation of i and y has a special distribution. It can be found
in some scattered rural dialects, but also in the spoken language in Stockholm and
Göteborg. In the areas where the damped pronunciation is part of the rural dialect
speakers tend to leave out this feature when they want to speak more standard-like,
while in the cities, Stockholm and Göteborg, the damped pronunciation is considered
posh (Elert, 2000, 45).


2.3.2.6 Long and short e and ä


In large parts of the language area the short vowels /e/ and /ɛ/ are two separate
phonemes (Elert, 2000, 46). However, in East Central Swedish and in the varieties
on Gotland and in Finland the two phonemes have merged. Johansson (1982, 94)
found that /e/ and /ɛ/ were maintained as two separate phonemes by most speakers
in a number of towns in Norrland. For young female speakers, however, she found
a perceptual merger of these two phonemes. The merger seems to be more common
among younger speakers than among older speakers today, and might be regarded
as unmarked standard pronunciation nowadays.

In a smaller geographic area the corresponding long vowels /eː/ and /ɛː/ have
merged, too, both being pronounced [eː]. However, the merger is not complete:
before /r/ the two phonemes are kept apart. So there is a distinction between, for
example, lera [leːra] ‘clay’ and lära [læːra] ‘learn’, but not between leka [leːka] ‘play’
and läka [leːka] ‘heal’. The merger has been part of the traditional dialects in the
surroundings of Stockholm and is also found in Finland. Contrary to the merger
of the short e and ä, the merger of the corresponding long vowels is not accepted
in Standard Swedish pronunciation and there seems to be a reversal of the merger
going on (Elert, 2000, 46).

In a study of the language of teenagers in Stockholm, Kotsinas (1994) found
three variants of /ɛː/: [eː], [ɛː] and [æː]. The pronunciation [eː] was more frequent
in the lower social class than in the higher social class, while [æː] was more frequent
in the higher social class. The open pronunciation [æː] could be interpreted as a
reaction against “uncultivated” Stockholm-speech.


2.3.2.7 Long and short u and ö


The pronunciation of /øː/ shows considerable variation (Elert, 2000, 47-48). In
East Central Sweden a more open pronunciation, [œː], has become popular among
younger speakers. Kotsinas (1994) found that the open pronunciation was the most
common one among Stockholm teenagers in all social groups. According to Elert
(2000, 47), the more open pronunciation is common across generations in some parts
of Sweden, for example Östergötland, Västergötland and Värmland.

The vowel written with the letter u in Swedish had the pronunciation [u(:)] in 
the 12th and 13th centuries when the Latin alphabet was introduced in Scandinavia. 
After that the vowel has been fronted leading to [#:/#:] and [o] in modern Swedish. 
This process has not had the same speed across the entire language area. Some rural 
dialects and regional standard varieties have a more central or back pronunciation of 
u than Standard Swedish, for example, Finland-Swedish and varieties in south-west
Skåne, Dalarna and north Västmanland (Elert, 2000, 49).

The short vowels /ɵ/ and /ø/ are merged in the speech of many speakers in East
Central Sweden (Elert, 2000, 48). Wenner (2010) showed that younger speakers in
Uppland have a shorter acoustic distance between /ɵ/ and /ø/ than older speakers,
which suggests an ongoing change. The merger is more common when an /r/ is
preceding or following the vowel than in other contexts.


2.3.2.8 The open allophones of ä and ö before r


As mentioned above, the ä and ö vowels have more open allophones that are used
only before /r/. However, the degree of openness of these allophones varies. The 
open allophones are pronounced most open along the east coast and in Finland 
(Elert, 2000, 48-49). In the west they are more close and in some places there are 
no allophones, so that the vowels have the same pronunciation in all positions. 

In a study of attitudes towards Swedish dialects by Bolfek Radovani (2000) not
only the pronunciation [eː] of /ɛː/ in Uppland, but also the extremely open [æː]
before /r/ were seen as stigmatized variants of which the dialect speakers themselves
were highly aware.


2.3.2.9 Long and short a 


The long a vowel is a slightly rounded open back vowel in Standard Swedish. Among
some speakers in Stockholm and Göteborg the vowel is even more rounded and close,
almost [ɔː] (Elert, 2000, 50). In Finland the vowel is generally more fronted and
unrounded: [aː].

In a few areas, the short vowel /a/ has a pronunciation closer to [æ]. Elert (2000,
51) mentions east Småland and Eskilstuna.


2.3.3 Vowel systems of the Swedish dialects 


The rural dialects in the Swedish language area differ from Standard Swedish at 
practically all linguistic levels (segmental, prosody, phonology, morphology, lexicon, 
semantics, syntax). The vowel systems of the dialects differ from Standard Swedish 
not only in the pronunciation of the vowel phonemes, but also in the number of 
phonemes. 

The differences in the vowel systems of the Swedish dialects in comparison to 
Standard Swedish are partly due to archaic features and partly due to innovations. 
Archaic features are, for example, the preservation of the Proto-Nordic diphthongs 
/ai/, /au/ and /eu/. These diphthongs were monophthongized in large parts of east- 
ern Scandinavia around 900-1100 and partly merged with original monophthongs. 
In Standard Swedish these original diphthongs have been replaced by long vowels. 
The monophthongization spread from the south and never reached some peripheral 
dialect areas. Dialects on Gotland, in parts of Finland and in parts of Norrland were 
never affected and have preserved the original diphthongs (Pettersson, 2005). 

Another type of diphthongs in the Swedish dialects are the so-called secondary 
diphthongs. These have developed from original monophthongs which have been 
diphthongized in some dialects. Secondary diphthongs are characteristic for the 
southern provinces of Sweden, but are also found on Gotland and in some Finland- 
Swedish dialects. 

Another archaic feature is the preservation of the Proto-Nordic long a. During the
13th and 14th centuries the pronunciation of this phoneme became more close and
back resulting in the å [oː] in modern Swedish. This change, too, never reached some
peripheral areas (for example Gotland), which preserved a more open pronunciation
[ai] [ar]. 

The Swedish quantity shift, around 1200, affected the vowel systems of the dia- 
lects in different ways. In Proto-Nordic four types of syllables could occur in stressed 
positions: VC, V:C, VC: and V:C:. The quantity shift resulted in the loss of VC 
and V:C: sequences in stressed syllables. In modern Standard Swedish and most 
dialects all stressed syllables have either a long vowel followed by a short consonant 
(V:C) or a short vowel followed by a long consonant or a consonant cluster (VC: or 
VCC). During the quantity shift, the new system was reached in different ways by 
different dialects. In some dialects the vowel in VC sequences was lengthened, while 
in other dialects the consonant was lengthened. Also the quality of the short vowels 
was affected differently in different dialects (Pettersson, 2005, 224-227). Some dia- 
lects were not affected at all, or only partly, by the quantity shift (mainly dialects 
in Norrland and Finland). 
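
The effect of the quantity shift on the inventory of stressed syllable types can be summarized in a few lines of code. This is merely an illustrative sketch; the Boolean encoding of length and the function names are assumptions, and a consonant cluster (VCC) is treated here as equivalent to a long consonant.

```python
def syllable_type(long_vowel, long_consonant):
    """Label a stressed syllable as VC, VC:, V:C or V:C:."""
    return ("V:" if long_vowel else "V") + ("C:" if long_consonant else "C")

def allowed_after_quantity_shift(long_vowel, long_consonant):
    """After the shift exactly one of vowel and consonant is long."""
    return long_vowel != long_consonant

combinations = [(v, c) for v in (False, True) for c in (False, True)]

# The four types that could occur in stressed positions in Proto-Nordic.
old_types = [syllable_type(v, c) for v, c in combinations]

# The types left in Standard Swedish and most dialects.
new_types = [syllable_type(v, c) for v, c in combinations
             if allowed_after_quantity_shift(v, c)]

print(old_types)
print(new_types)
```

Dialects that were not affected, or only partly affected, by the quantity shift would still allow one or both of the remaining two types.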

Even though there is a long tradition of dialect research in Sweden and Finland, 
the phoneme systems of Swedish dialects as spoken nowadays are largely unknown. 
Structural descriptions of the vowel systems of dialects are rare, and because of 
the large changes that the dialects have gone through since the middle of the 20th 
century older descriptions of the phoneme systems are not necessarily valid anymore. 
Many dialects have not been the subject of recent studies, and even for dialects that 
have recently been studied, describing the phoneme system is not always easy. In a 
phonological analysis of the Swedish dialects of the province Österbotten in Finland,
known for well-preserved rural dialects, Wiik (2002) concluded that especially the
long vowel systems are difficult to describe. The traditional rural dialects have more
long vowels than short vowels, mainly due to diphthongs among the long vowels.
However, because of dialect leveling the long vowel systems are unstable. There
are, for example, tendencies to monophthongize diphthongs, and in words borrowed
from Standard Swedish, phonemes that were not originally part of the rural
dialect are used. The study by Wiik (2002) included the dialects of 30 parishes in
Österbotten. Only a few of these dialects showed identical long vowel systems. The
number of long vowel phonemes varied between ten and sixteen, while the number 
of short vowel phonemes was less variable (between seven and nine short vowel 
phonemes). 


2.4 Acoustic analysis of vowels 


When studying phonetic differences between languages or language varieties we can
choose to work either with phonetically transcribed data or with acoustic analysis
of speech samples. A disadvantage of phonetic transcription is that it requires a
large amount of work and still is not a very exact method. Linguistic experience will
influence the decisions even of highly trained transcribers (Dioubina & Pfitzinger,
2002), and transcribers do not even always agree with themselves when transcribing
the same utterance multiple times (Pfitzinger, 2003).

Numerous efforts have been made to assess phonetic quality of speech segments
directly from the acoustic signal. The most common way of measuring vowel quality
acoustically is formant measurement, explained in § 2.4.1. Another approach is to use
so-called whole-spectrum methods. Different whole-spectrum methods are discussed
in § 2.4.2.

Analyzing vowel quality acoustically is a more objective method than using tran- 
scriptions, but does not come without its own problems. The acoustic signal in- 
cludes information that is not linguistically meaningful, but has to do with anatom- 
ical/physiological differences between speakers (§ 2.4.3). Transcribers are very good 
at ignoring spectral differences that are due to speaker-specific variation and report 
the actual linguistic differences that they perceive. But when working with acoustic 
speech data, normalizing for the non-linguistic speaker-dependent variation has al- 
ways been problematic. Different approaches to speaker normalization are discussed 
in § 2.4.4. 


2.4.1 Measuring formants 


Since the work of Peterson & Barney (1952) formant measurements have been the 
classical way of measuring vowel quality. The two first formants (i.e. the lowest 
resonance frequencies of the vocal tract during pronunciation) are the most distinct- 
ive acoustic parameters that determine vowel quality. These correspond relatively 
well to the articulatory vowel features height and advancement. This can be seen in 
the spectrograms in Figure 2.3. The horizontal axis shows time, while the vertical 
axis shows frequency. The intensity of the speech signal at different frequencies at 
a given time is expressed by the darkness of the shading. The formants show up 
as darker bands in the spectrogram. [i] and [u], which are both close vowels, have a
low first formant (F1, the lowest formant), while the open vowels [æ] and [a] have a
higher F1. The front vowel [i] has a high frequency for F2, while the back vowel [u]
has a very low F2. The F2 values for [æ] and [a] are in between these two extremes.

A two-dimensional graph, Figure 2.4, where the two dimensions represent tongue
advancement and height has been used by phoneticians to display the main
parameters of vowel production. Joos (1948) showed that plotting vowels in an acoustic
plane, where F2 is represented on the horizontal axis and F1 on the vertical axis
with the origin in the upper right corner, leads to a configuration that is relatively
similar to the articulatory graph. This parallel between the articulatory and acoustic planes
is indicated with the arrows in Figure 2.4. However, the correspondence between 
formants and vowel height and advancement is not perfect. Rosner & Pickering 
(1994, 46-48) show that altering the three articulatory parameters constriction size, 
constriction location and mouth opening in vowels does not have any one-to-one 
correspondence with formants, but both the F1 and the F2 values are influenced by 
the change in any of the articulatory parameters. 
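
The axis convention of the acoustic plane described above can be made concrete with a short sketch. The formant values are approximate averages for adult male speakers of American English after Peterson & Barney (1952) and serve only as illustrative input; the function name and the choice of negating both formants are assumptions made for the example.

```python
# Approximate (F1, F2) values in Hz, adult male speakers of American English,
# after Peterson & Barney (1952); illustrative input only.
FORMANTS_HZ = {
    "i": (270, 2290),
    "æ": (660, 1720),
    "ɑ": (730, 1090),
    "u": (300, 870),
}

def chart_position(f1, f2):
    """Map (F1, F2) to (x, y) with the origin in the upper right corner.

    Negating both values makes a high F2 (front vowel) end up on the left
    and a low F1 (close vowel) end up at the top, as in the vowel chart.
    """
    return (-f2, -f1)

positions = {vowel: chart_position(f1, f2)
             for vowel, (f1, f2) in FORMANTS_HZ.items()}

# Vowels ordered from left (front) to right (back) and from top (close)
# to bottom (open).
front_to_back = sorted(positions, key=lambda v: positions[v][0])
close_to_open = sorted(positions, key=lambda v: -positions[v][1])
print(front_to_back)
print(close_to_open)
```

In a plotting library the same effect is usually obtained by plotting the raw values and reversing both axes, for example with `invert_xaxis()` and `invert_yaxis()` in matplotlib.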

The third formant also plays a role in vowel categorization, but this is less well
understood than F1 and F2. The importance of F3 seems to vary across languages,
and for some languages the first two formants are enough for listeners to identify all
vowels (for a discussion of this see Adank, 2003, 38-39). However, phoneticians warn
against using only F1 and F2 in a routine manner since some valuable data present
in the third formant might get lost. Fujimura (1967) found that for the identification of
Swedish front rounded vowels F3 seems to be important.

Figure 2.3. Vowel spectrograms of [i], [æ], [a] and [u]. The dashed lines mark the
approximate positions of the formants.

Figure 2.4. Vowel quadrilateral (vowel chart of the International Phonetic
Association) with formant affiliation. F1 and F2 (Hz) with values of a typical male
voice.

In Figure 2.3 the three first formants can be most clearly distinguished in the 
case of [æ]. The vowel [i] has a large distance between F1 and F2, while F2 and F3 
are so close to each other that they are hard to distinguish. In the cases of [a] and 
[u] F1 and F2 are very close to each other. 

Linear predictive coding is the most commonly used computational method for 
estimating formant frequencies from a speech signal (Rosner & Pickering, 1994, 8- 
11). A problem with algorithms for automatically determining formant frequencies in 
a spectrum is that they always give some false measurements. Formants very close to 
each other (like F1 and F2 in [u]) sometimes cannot be separated by the algorithms, 
and, conversely, the algorithms sometimes find false formants in the gap 
between two adjacent formants that are far apart (like F1 and F2 in 
[i]). In a study of Swedish vowels by Eklund & Traunmüller (1997), where formants 
were first measured automatically and subsequently checked manually, corrections 
had to be made for approximately a quarter of the vowel segments. Likewise, Adank, 
Van Hout, & Smits (2004) report that the automatic formant measurements of 20- 
25% of the vowels in a Dutch study had to be corrected manually. 

Formant measurement has been used for measuring vowel quality in a number 
of large-scale studies of regiolects and sociolects. Labov, Ash, & Boberg (2005) 
described regional varieties of North American English by measuring the first two 
formants of, on average, 300 vowel tokens per speaker for 439 speakers. Adank, Van Hout, & Van de 
Velde (2007) investigated regional differences in vowel pronunciation in Standard 
Dutch based on measurements of duration and formant frequencies of 15 vowels 
of 160 Dutch and Flemish speakers. Clopper & Paolillo (2006) analyzed formant 
frequencies and duration of 14 American English vowels as produced by 48 speakers 
from six dialect regions. However, the need for manual correction of the data makes 
formant measurements a very time-consuming method when working with data sets 
including a large number of speakers. 


2.4.2 Whole-spectrum methods 


Besides formant-based approaches to measuring vowel quality, whole-spectrum methods 
are also used. These are sometimes preferred since they include more acoustic 
information than formant frequencies. Moreover, whole-spectrum methods can gen- 
erally be more reliably automated than formant analysis, which makes them more 
suitable for fast analysis of large amounts of data. 

In the 1960s, Dutch researchers introduced principal component analysis (PCA) 
of band-pass-filtered® spectra as a method for identifying vowels acoustically (Plomp, 


° Applying an acoustic filter means that sound energy above and/or below a given frequency in 
Pols, & Van de Geer, 1967). Vowel spectra were band-pass filtered with ⅓-octave 
filters, roughly corresponding to the ear’s critical bandwidth. The vowel spectra 
were filtered up to 10,000 Hz, which, after combining the lowest filters in order to 
reduce the effect of fundamental frequency and for even better correspondence with 
the critical bandwidth, resulted in each vowel being described by the sound-pressure 
levels in decibels in 18 pass bands. PCA was used to find the linear combina- 
tions of the pass bands in the filter bank representation that would explain most 
of the variance in the data. The two first principal components (PCs) turned out 
to explain most of the variance (68.4%), while extracting up to four PCs signific- 
antly improved the amount of explained variance (84.1%). The PCs were shown 
to correlate highly with perceptual dimensions used by listeners to identify vowels 
(Pols, Van der Kamp, & Plomp, 1969). The method was mainly seen to be a use- 
ful application within automatic speech recognition. An advantage of a filter bank 
representation over measuring formants is that using band-pass filtering and PCA is 
much faster than formant analysis and does not require manual correction. The two 
first PCs correspond well with the two first formants. Average formant 
values of twelve vowels pronounced by 50 male speakers correlated very highly 
(r = 0.989 and r = 0.993) with the two first PCs after an optimal rotation of the 
PC plane (Pols, Tromp, & Plomp, 1973). However, correlating formants with the 
PC values of the individual speakers showed significantly less correspondence. In an 
automatic recognition task vowels were identified about equally well using formants 
and PCs (Pols et al., 1973). 
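
The core of the procedure described above, finding the linear combinations of pass bands that explain most of the variance, can be illustrated with a minimal sketch. It is not the implementation of Plomp et al. (1967): the function name `pca`, the use of power iteration, and the band levels in `band_levels_db` are assumptions made purely for illustration, and real data would have 18 bands per vowel rather than four.

```python
import math
import random


def pca(rows, n_components=2, n_iter=200):
    """Return (components, explained_ratio) for row-wise observations.

    Each row is one vowel token, each column one pass band (level in dB).
    Power iteration with deflation is used on the covariance matrix; this
    is adequate for a small illustration, not for large data sets.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_variance = sum(cov[i][i] for i in range(d))
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    components, explained = [], []
    for _ in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left in any direction
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * cov[i][j] * v[j]
                         for i in range(d) for j in range(d))
        components.append(v)
        explained.append(eigenvalue / total_variance)
        # Deflation: remove the variance captured by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return components, explained


# Hypothetical band levels (dB) for five vowel tokens in four pass bands.
band_levels_db = [
    [62.0, 55.0, 40.0, 30.0],
    [60.0, 57.0, 42.0, 31.0],
    [50.0, 60.0, 55.0, 35.0],
    [48.0, 62.0, 58.0, 36.0],
    [55.0, 58.0, 47.0, 33.0],
]
_, ratios = pca(band_levels_db)
print([round(r, 3) for r in ratios])
```

The explained-variance ratios returned by the sketch correspond to the kind of percentages quoted above for the first two and first four PCs.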

Van Nierop, Pols, & Plomp (1973) compared the results of a PCA on band-pass 
filtered spectra of twelve Dutch vowels by 25 female speakers to the earlier results 
by Klein, Plomp, & Pols (1970) and Pols et al. (1973) based on 50 male speakers 
pronouncing the same vowels. The eigenvectors of the two first components were 
rather similar for the female and male data, but the peaks in the curves of the female 
analysis occurred about 3-octave higher than the corresponding curves for the male 
data. The extracted components were not identical for the female and male data, 
but a small rotation of one of the two configurations showed that there was almost 
a complete convergence of the two solutions after rotation. 

Jacobi (2009) applied PCA to band-pass filtered spectra in a study of variation 
in Dutch diphthongs and long vowels (/ei/, /au/, /œy/, /o:/ and /e:/) among 70 
speakers (35 female and 35 male). Her aim was to find out if the sub-phonemic 
variation in these vowels could be related to socio-economic status and the age of 
the speakers. Band-pass filtering was chosen over formant analysis, since it can be 
fully automated without the need for manual correction of errors and yet, through 
high correlation with formants, offers interpretability in terms of articulatory and 
perceptual attributes. However, applying the method introduced by Plomp et al. 
(1967) to a variational linguistic study presented some challenges. The experiments 
of the Dutch researchers in the 1960s and 1970s were done in controlled environments. 


a sound spectrum is filtered away. The part of the spectrum remaining is called a pass band. A 
band-pass filter can be characterized by its center frequency and bandwidth. See Johnson (2003, 
14-17). 
Jacobi’s data, on the other hand, consisted of spontaneous speech from the Corpus 
Gesproken Nederlands (= Corpus of Spoken Dutch) in extremely varying recording 
situations. Plotting the speakers’ /a/–/i/–/u/ vowel triangles in the PC2/PC1 plane 
showed a remarkable variability between speakers in the positioning of the triangles. 
The differences could be traced back to the signal-to-noise ratios in the recordings, 
which varied from interviews in silent environments to private conversations and 
broadcast recordings with music in the background. It turned out that the lower 
the signal-to-noise ratio (that is, the more noise), the smaller the measured vowel 
space in the PC plane and the higher the PC values. Vowels produced by men 
and women were analyzed together in the PCA. Some significant differences in the 
PC values between the sexes were found, but these were small compared to the 
differences depending on the varying recording situations. In order to normalize for 
the differences in position and size of the vowel spaces of different speakers, relative 
PC values were used: the positions of all vowels in the PC plane were related to 
the speaker’s point vowels /i/ and /a/.® In order to enhance the interpretability 
of the PCA solution Jacobi used only point vowels to build up the PCA. When 
an equal number of all point vowels are used in the analysis phase of the PCA, all 
articulatory-acoustic dimensions are represented equally and all possible differences 
in vowel quality are accounted for. The results can then be used to represent all 
other vowels within the vowel space. 

For the band-pass filtering, Jacobi used Bark filters instead of ⅓-octave filters. 
The Bark scale is a psycho-acoustical scale corresponding to the critical bandwidth 
of human hearing, which means that a representation in Bark filters should corres- 
pond to how humans perceive vowel sounds. The Bark scale is roughly linear up to 
1000 Hz and roughly logarithmic at higher frequencies. Jacobi found high correla- 
tions between PCs and formants (see § 5.2.1) and concluded: “pc1 and pc2 of a PCA 
on barkfiltered /a/, /i/, /u/ yielded comparable results to F1Bark and F2Bark, and 
are thus easily interpretable in terms of articulation” (Jacobi, 2009, 42). 
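
The shape of the Bark scale can be illustrated with a small conversion function. The sketch below assumes the approximation of Traunmüller (1990), z = 26.81 f / (1960 + f) - 0.53; the text above does not state which Bark formula was used by Jacobi or in the present study, so the choice of formula and the function name `hz_to_bark` are assumptions for illustration only.

```python
def hz_to_bark(f_hz: float) -> float:
    """Convert a frequency in Hz to Bark.

    Assumption: Traunmueller's (1990) approximation; other Bark
    formulas exist and give slightly different values.
    """
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


# Equal steps in Hz become smaller steps in Bark at higher frequencies.
for low, high in [(0.0, 1000.0), (1000.0, 2000.0), (3000.0, 4000.0)]:
    step = hz_to_bark(high) - hz_to_bark(low)
    print(f"{low:6.0f}-{high:6.0f} Hz: {step:.2f} Bark")
```

The printed steps shrink as frequency increases, which reflects the compression of higher frequencies described above.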

In automatic speech recognition Mel frequency cepstral coefficients (MFCCs) are 
widely used today for identifying speech sounds. Like the Bark scale, the Mel scale 
is an auditory scale (a doubling of Mels corresponds to the doubling of perceived 
pitch). MFCCs are obtained by applying discrete cosine transformation to Mel- 
scaled bandpass filters (Harrington, 2010). In a study exploring cluster analysis as 
a method for classifying dialects Lundberg (2005) applied MFCCs to vowel data 
from 285 subjects from 95 sites in the Swedish language area (see also § 2.2.3). A 
reduction of twelve MFCCs to two dimensions by multidimensional scaling showed 
a configuration similar to the IPA vowel quadrilateral (Lundberg, 2005, 26). 
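
The two steps behind MFCCs named above, a Mel-scaled filter bank followed by a discrete cosine transformation, can be sketched as follows. This is a simplified illustration and not the pipeline of Lundberg (2005): the Mel formula (a commonly used approximation), the unnormalized DCT-II, and the names `hz_to_mel`, `dct2` and `cepstral_coefficients` are assumptions, and the filter-bank energies are taken as given rather than computed from a signal.

```python
import math


def hz_to_mel(f_hz):
    """Hz to Mel, using a commonly cited approximation (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def dct2(values):
    """Unnormalized DCT-II of a sequence."""
    n = len(values)
    return [sum(x * math.cos(math.pi * k * (i + 0.5) / n)
                for i, x in enumerate(values))
            for k in range(n)]


def cepstral_coefficients(band_energies, n_coeffs):
    """Log-compress Mel filter-bank energies and keep the first DCT terms."""
    log_energies = [math.log(e) for e in band_energies]
    return dct2(log_energies)[:n_coeffs]


# Hypothetical energies in eight Mel-spaced bands for one vowel frame.
energies = [4.0, 9.0, 6.0, 2.0, 1.5, 3.0, 1.0, 0.5]
print([round(c, 3) for c in cepstral_coefficients(energies, 4)])
```

The low-order coefficients describe the broad shape of the spectrum, which is why a small number of them is usually retained.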

For the present study PCA on Bark filtered vowel spectra was chosen for ana- 
lyzing vowel segments acoustically. The main reason for choosing a whole-spectrum 
method instead of measuring formants is that the whole-spectrum methods have 
been shown to be more reliably automatable than formant measurements. Choosing 
a method which does not need manual correction was important, because the data 


®Because /u/ was the most sensitive point vowel to the signal-to-noise ratio only /i/ and /a/ 
were used for calculating the relative positions. 
for the present study includes nearly 1,200 speakers (see Chapter 4). MFCCs could 
have been chosen instead of Bark filters and PCA, and might yield similar results. 

The method and its application to the present data set are described more closely 
in Chapter 5. In addition to band-pass filtering, formant analysis was applied to a 
smaller subset of the data in order to validate the method and to find the optimal 
PCA configuration (see § 5.2). 


2.4.3 Speaker-dependent variation in vowels 


A challenge for all acoustic methods is the speaker-dependent variation in speech 
sounds due to the different sizes and shapes of speakers’ vocal tracts. Two vowels 
uttered by two different speakers can have very similar formant frequencies while 
they are perceived as belonging to different phonemes by listeners of the language. 
And, conversely, two vowels sounding the same to listeners might differ in formant 
frequencies. Generally a smaller vocal tract generates higher formant frequencies. 
This is why children have higher formant frequencies than adults and women gen- 
erally higher than men (Peterson & Barney, 1952). But also within age and gender 
groups individuals show a lot of variation. 

Figure 2.5 shows an example of the inter-speaker variability in formant meas- 
urements. In the graphs the formant frequencies in Bark of ten Swedish long vow- 
els produced by three adult speakers, two female (yw_1 and yw_3) and one male 
(ym_2), are plotted. The speakers all speak the same dialect, the dialect of Malung. 
The data come from the data set discussed in § 5.2 below. It is evident that the 
vowel spaces vary both in size and position in the F2/F1 plane (scales are reversed 
in order to resemble the articulatory vowel quadrilateral). Looking at the absolute 
formant frequencies in Bark, the [u:] in sot of speaker yw_1 is closest to the [o:] 
in lås of ym_3. The [æ:] (lär) of the male speaker (ym_2) is closest to the [œ:] 
(dör) of the two female speakers. However, the relative positions of the vowels in all 
speakers’ vowel spaces resemble each other a lot. 

Inter-speaker differences, like the ones present in formant measurements, are 
usually found also when using whole-spectrum methods for vowel analysis. 

The mechanisms behind the speaker normalization that listeners perform are 
not fully understood. The problem has been viewed both from a vowel-extrinsic and 
from a vowel-intrinsic perspective. Extrinsic adaptation means that listeners adapt 
to every speaker’s vowel system as a whole, while intrinsic adaptation means that 
the information needed for normalization is included in each segment itself. Intrinsic 
factors have been sought in distances between fundamental frequency and higher 
formants. Fundamental frequency (F0) seems to be an important cue for identifying 
vowels in mixed speaker conditions as shown in experiments by Nusbaum & Morin 
(1992), Eklund & Traunmüller (1997) and Halberstam & Raphael (2004). 
Figure 2.5. The vowel spaces of three speakers of the same dialect (Malung). Ten 
different long vowels are plotted in the F1-F2 plane. The words used for eliciting 
the vowels are displayed: dis [i:], dör [œ:], lat [a:], leta [e:], lus [ʉ:], lås [o:], lär [æ:], 
sot [u:], söt [ø:], typ [y:]. 

Table 2.3. Division of formant-based speaker normalization procedures into four 
groups. The division was introduced by Adank (2003). 

                | Formant-intrinsic       | Formant-extrinsic 
Vowel-intrinsic | scale transformations   | Syrdal & Gopal (1986) 
                | (e.g. Bark, Mel)        | 
Vowel-extrinsic | Gerstman (1968)         | Nordström & Lindblom (1975) 
                | Lobanov (1971)          | Miller (1989) 
                | Neary (1977)            | 


2.4.4 Speaker normalization 


A wide range of different normalization procedures for vowels have been suggested in 
different studies. The choice of a normalization procedure depends crucially on the 
aim of the normalization. For example, in automatic speech recognition systems the 
aim is to remove all variation other than the phonemic. For sociolinguistic 
and dialectological research, on the other hand, we are interested in also maintaining 
sub-phonemic variation. The aim of speaker normalization procedures has been 
described by Disner (1980) as follows: “they should maximally reduce the variance 
within each group of vowels presumed to represent the same target when spoken by 
different speakers, while maintaining the separation between such groups of vowels 
presumed to represent different targets”. 


2.4.4.1 Formant-based normalization procedures 


Adank (2003) divided formant-based normalization procedures into formant-intrinsic 
and formant-extrinsic, as well as vowel-intrinsic and vowel-extrinsic (Table 2.3). 
Vowel-intrinsic means that the normalization can be done using information present 
in a single vowel token, while vowel-extrinsic means that the whole vowel space of 
a speaker, or at least the point vowels, are taken into account when performing 
the normalization. Formant-intrinsic means that each formant can be normalized 
without knowledge of higher or lower formants (including F0). Formant-extrinsic 
methods use information across formants, for example, relative distances between 
formants (including F0). 

A vowel-intrinsic/formant-intrinsic method is basically a transformation of formant 
values according to some scale. The Hertz scale that is used for measuring 
frequencies is a linear scale. Bark and Mel are examples of auditory scales, which 
correspond better to human perception, which is roughly linear up to 1000 Hz and 
roughly logarithmic at higher frequencies. Transforming the measured Hertz values 
to an auditory scale improves the correspondence between acoustic measures and 
perception. 

A vowel-intrinsic/formant-extrinsic procedure was used by Syrdal & Gopal (1986). 
In this method the data was transformed by measuring the distances in Bark between 
adjacent formants. 
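
A minimal sketch of such a Bark-difference transformation is given below. The differences F1-F0, F2-F1 and F3-F2 follow the description above, with F0 counted among the formants; the Bark conversion (Traunmüller's 1990 approximation) is an assumption of this sketch and is not necessarily the formula used by Syrdal & Gopal (1986). The function names and the formant values are hypothetical.

```python
def hz_to_bark(f_hz):
    """Hz to Bark; Traunmueller's (1990) approximation is assumed here."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def bark_differences(f0, f1, f2, f3):
    """Return (F1-F0, F2-F1, F3-F2) in Bark for a single vowel token.

    Only information from the token itself is used, which is what makes
    the procedure vowel-intrinsic; combining several formants makes it
    formant-extrinsic.
    """
    barks = [hz_to_bark(f) for f in (f0, f1, f2, f3)]
    return tuple(upper - lower for lower, upper in zip(barks, barks[1:]))


# Hypothetical measurements (Hz) of one [i]-like token.
print([round(d, 2) for d in bark_differences(120.0, 300.0, 2200.0, 2900.0)])
```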

Several methods use information across vowel categories for speaker normaliza- 
tion (vowel-extrinsic methods). Gerstman (1968) introduced a method for normalizing 
on the basis of the highest and lowest F1 and F2 values of each speaker. Lobanov 
(1971) applied a z-transformation to the F1 and F2 values of every speaker. Neary (1977) 
applied a log transformation to the Hertz frequencies and centered the formant val- 
ues of every speaker. In Neary’s approach the center of every speaker’s vowel space 
was moved to zero, but varying sizes of the vowel spaces of different speakers were 
not normalized for. Labov (1994) has used Neary’s log mean transformation in his 
sociolinguistic studies. 
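
The three vowel-extrinsic procedures can be sketched for a single formant of a single speaker as follows. This is a simplified illustration under stated assumptions rather than a faithful reimplementation of the original papers: the range used for the Gerstman-style rescaling (0 to 1), the use of the sample standard deviation in the z-transformation, the per-formant centering in the log-mean procedure, and all function names and formant values are choices made for this sketch.

```python
import math
from statistics import mean, stdev


def range_normalize(values):
    """Gerstman-style: rescale by the speaker's lowest and highest value."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def z_normalize(values):
    """Lobanov-style: z-transformation with the speaker's mean and s.d."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def log_mean_normalize(values):
    """Neary-style: log-transform, then center on the speaker's log mean."""
    logs = [math.log(v) for v in values]
    m = mean(logs)
    return [x - m for x in logs]


# Hypothetical F1 values (Hz) of the same three vowels for two speakers;
# speaker_b has uniformly higher formants than speaker_a.
speaker_a = [300.0, 500.0, 700.0]
speaker_b = [360.0, 600.0, 840.0]
for normalize in (range_normalize, z_normalize, log_mean_normalize):
    print(normalize.__name__,
          [round(v, 3) for v in normalize(speaker_a)],
          [round(v, 3) for v in normalize(speaker_b)])
```

In this constructed example the two speakers differ only by a constant factor, so all three procedures map them onto the same values. Each procedure needs the speaker's values for all vowels, which is why the speakers must share the same vowel categories for the results to be comparable.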

A problem with the vowel-extrinsic methods is that they assume that either the 
average values or the extreme values of all speakers are linguistically stable. The 
approaches of Lobanov (1971) and Neary (1977) imply that all speakers use the same 
phoneme system and that data of all vowel categories are present for all speakers. 
Otherwise the average values of different speakers would not be comparable. In the 
procedure by Gerstman (1968) a reliable transformation is possible only if the vowel 
phonemes representing the highest and lowest F1 and F2 values are linguistically 
stable without sub-phonemic variation. 

The vowel-extrinsic/formant-extrinsic normalization procedures use information 
across vowel tokens and across formants. In addition to being vowel-extrinsic/formant- 
extrinsic, the normalization procedure by Nordström & Lindblom (1975) is speaker- 
extrinsic. In their study Nordström & Lindblom (1975) started by estimating the 
average length of vocal tracts of men and women by calculating average F3 values of 
open vowels within each sex group. Because the length of the vocal tract correlates 
highly with formant frequencies, a scale factor based on the ratios between average 
male and female vocal tract lengths (as estimated by F3) could be used for trans- 
forming the formant frequencies. This procedure does not remove speaker-dependent 
differences within the two sex groups, but it normalizes for the systematic differences 
between male and female voices. 
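
The idea of a group-level scale factor can be sketched as follows. The sketch assumes that the factor is the ratio of the two groups' average F3 in open vowels and that it is applied as a uniform multiplication of one group's formant frequencies; the exact formulation in Nordström & Lindblom (1975) may differ, and the function names and all values are hypothetical.

```python
from statistics import mean


def scale_factor(f3_reference_group, f3_other_group):
    """Ratio of the average F3 (open vowels) of two speaker groups.

    Since F3 in open vowels is taken as an estimate of vocal tract
    length, the ratio stands in for the ratio of vocal tract lengths.
    """
    return mean(f3_reference_group) / mean(f3_other_group)


def rescale(formants_hz, k):
    """Map one group's formant frequencies onto the reference group's scale."""
    return [f * k for f in formants_hz]


# Hypothetical average F3 values (Hz) in open vowels.
f3_male = [2500.0, 2600.0]
f3_female = [3000.0, 3120.0]
k = scale_factor(f3_male, f3_female)
print(round(k, 3), [round(f) for f in rescale([600.0, 1500.0], k)])
```

Because a single factor is applied to a whole group, differences between individual speakers within the group are left untouched, as noted above.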

The normalization procedure of Miller (1989) is based on distances between formants 
and is hence formant-extrinsic. Furthermore, a speaker-specific anchor point based 
on average F0 values is used, which makes it vowel-extrinsic. 


2.4.4.2 Normalization in whole-spectrum approaches 


A centering of data, corresponding to the procedure of Neary (1977), was used on 
bandpass-filtered vowel data by Pols et al. (1973). Jacobi (2009) applied centering 
to principal components of bandpass-filtered data, but decided on using another 
method for speaker normalization. The point vowels /i/ and /a/ were considered 
stable point vowels in the Dutch data in Jacobi’s study. Hence, positions relative 
to each speaker’s /i/ and /a/ were calculated for all vowels, which accounted for 
differences in the size and position of the vowel spaces of different speakers. 

Both the method of Pols et al. (1973) and the one of Jacobi (2009) are vowel- 
extrinsic and, therefore, depend either on identical phoneme systems or stable point 
vowels in the data. 
2.4.4.3 Evaluations of normalization procedures 


When comparing different vowel normalization procedures Disner (1980) found that 
Neary’s log-mean procedure was best at reducing scatter within languages. But a 
crucial result from her study was that most of the procedures tested performed quite 
poorly when comparing vowels from different languages, and some procedures even 
reversed the linguistic trends. Procedures like those of Neary (1977) and Lobanov 
(1971), which use mean values and/or standard deviations of formants for normal- 
izing, fail in comparing language varieties with different phoneme systems because 
the means and standard deviations of these systems are not comparable. These 
procedures are good at indicating the relative position of vowels within the phonetic 
space of one language, but the relative positions do not represent the positions in a 
universal phonetic space. 

One of the few procedures that Disner (1980) found to be valid for cross-language 
comparison was the PARAFAC procedure of Harshman (1970), which can be used 
for isolating and then averaging speaker-dependent differences. A problem with 
PARAFAC, however, is that it assumes a priori knowledge of the phoneme cat- 
egory of each vowel token. Hence, it cannot be used for data sets where manual 
categorization is not done prior to the acoustic normalization. 

Adank (2003) evaluated a number of speaker normalization procedures, among 
others, the ones in Table 2.3 (p. 31), with the criterion that a successful normal- 
ization procedure should preserve phonemic variation and sociolinguistic speaker- 
related variation, but minimize the anatomical/physiological speaker-dependent vari- 
ation. In within-language comparison, that is, when comparing varieties with the 
same phoneme system, the vowel-extrinsic/formant-intrinsic normalization proced- 
ures perform quite well. Adank (2003) found that Lobanov’s (1971) procedure was 
most effective in reducing speaker-specific variation while maintaining sociolinguistic 
variation. The vowel-intrinsic procedures performed poorly in Adank’s evaluation 
and the formant-extrinsic procedures performed worse than the formant-intrinsic 
ones. Adank’s results support the results of Disner (1980), and Adank (2003, 183) 
strongly emphasizes that “it is not advisable to carry out normalization procedures 
on data sets that are not fully phonologically comparable”. 

In studies of formant frequencies that include languages or dialects with differ- 
ent phoneme systems or deviating vowel spaces, speaker normalization remains a 
problem. A common solution when wanting to compare formant values of differ- 
ent varieties is to use raw formant measurements, but to average over a number of 
speakers (for example, Adank et al., 2007). Averaging over a sufficient number of 
speakers will reduce the effect of varying sizes of vowel spaces of individual speakers. 
However, this method can be used only when one can assume that there are no 
systematic differences in the size and shape of the vocal tracts across the different 
groups of speakers. Female and male speakers cannot be directly compared to each 
other by using within-group averages, because men have systematically lower formant 
frequencies than women. Nor was using group averages an option in a 
study by Yang (1996) where American English and Korean vowels were compared, 
because there was a considerable difference in vocal tract length between American 
English and Korean speakers. In order to be able to normalize for the differences 
in vocal tract length between the speaker groups, Yang (1996) estimated the vocal 
tract length by applying the method of Nordström & Lindblom (1975), originally 
developed for normalizing for the differences between male and female voices (see 
§ 2.4.4.1). 

The varieties analyzed in the present study are not fully phonologically com- 
parable, which makes it impossible to use most of the normalization procedures 
mentioned above. Neither can stable point vowels be found in the data, which ex- 
cludes the possibility to use a normalization procedure comparable to the one used 
by Jacobi (2009). Using raw acoustic measurements and averaging over groups of 
speakers is complicated by the small number of speakers per variety. At each site, 
around twelve speakers were recorded representing both sexes and two generations 
of speakers (see § 4.3). The number of male and female speakers is not constant 
across all sites and vowels, and considerable variation across the two age groups can 
be expected (see § 2.1) making a comparison of the two age groups necessary, which 
reduces the number of speakers per variety. 

For these reasons, a solution comparable to the one by Nordström & Lindblom 
(1975) was sought. That is, the aim was not to try to remove all speaker-dependent 
variation, which would be virtually impossible, but to normalize for the systematic 
differences between male and female voices. The systematic differences in vowel 
spectra produced by men and women found by Van Nierop et al. (1973) (see § 2.4.2) 
were used as a basis for the normalization. The procedure is described in Chapter 5. 


2.5 Dialect geography and dialectometry 


In dialectology, regional differences in language use are studied. Dialect geography, 
more precisely, maps geographic distributions of dialectal features. The first large- 
scale dialect geographic project was the one by Georg Wenker in Germany in 1876. 
Questionnaires were completed by about 45,000 schoolmasters in Germany, and 
based on the survey Wenker drew maps by hand, each map representing a dialectal 
feature. This project was followed by many dialect atlas projects in Europe and 
North America in the first half of the 20th century. The usual way to present the 
data in dialect atlases was creating display maps representing one single feature. In 
secondary studies based on data from dialect atlases interpretive maps, grouping 
variants into the most predominant groups, are also found. 

The traditional method of identifying dialect areas has been the isogloss method. 
The term isogloss was introduced by J. G. A. Bielenstein in 1892 (Chambers & 
Trudgill, 1998, 89). An isogloss is a line on a map that is drawn between locations 
where speakers use different variants of a feature. When many isoglosses are drawn 
on the same map some patterns can usually be identified. When many isoglosses co- 
incide they form an isogloss bundle, which usually indicates a major dialect division. 
A well-known isogloss bundle from Swedish dialectology is the one distinguishing 
South Swedish dialects (Andersson, 2007, 42). South of the isogloss bundle a 
dorsal /r/ is used, while apical /r/ is predominantly used in other Swedish varieties. 
Other isoglosses coinciding closely with the /r/ isogloss are the lack/use of retroflex 
consonants and the use of a retroflex flap for /rd/. 

The problem of identifying dialect areas by means of isoglosses has been formu- 
lated by Chambers & Trudgill (1998, 96-97) as follows: 


“It is undeniable that some isoglosses are of greater significance than 
others, in the sense that some mark distinctions ‘felt’ to be culturally 
important while others do not, some persist while others are transitory, 
and the like. It is equally obvious that some bundles are more significant 
than others, in the same sense. Yet in the entire history of dialectology, 
no one has succeeded in devising a satisfactory procedure or a set of 
principles to determine which isoglosses or which bundles should outrank 
some others. The lack of theory or even a heuristic that would make this 
possible constitutes a notable weakness in dialect geography.” 


In the middle of the 20th century dialect geography as an international discipline 
declined. One of the reasons was that proper methods for analyzing the huge data 
collections in the dialect atlases were lacking (Chambers & Trudgill, 1998, 20). A 
revitalization started in the 1980s when computational methods offered new possib- 
ilities for analyzing the data. By this time sociolinguistics had also developed a new 
theoretical framework for analyzing linguistic variation. 

When using isoglosses for dividing language areas into dialect regions, the choice 
of which linguistic features to emphasize is subjective and different researchers are 
likely to make different choices. As a more objective alternative to the isogloss 
methods Séguy (1973) introduced dialectometry (Fr. dialectométrie). Séguy worked 
with data from Atlas linguistique de la Gascogne. Instead of only drawing isoglosses 
based on single features, Séguy started by calculating dissimilarity scores between 
dialects based on all available linguistic features in the atlas. The linguistic distances 
between adjacent dialects were plotted on maps using thicker or thinner lines indic- 
ating, respectively, larger or smaller distances. The main dialect borders could be 
identified by the thickest lines. At the same time the maps showed that the dialect 
landscape is continuous without many abrupt borders. 

Goebl (1982, 1984, 2006) adapted Séguy’s ideas and started using computational 
methods for calculating linguistic distances between varieties. Goebl extended the 
dialectometric idea by not only calculating the similarity between geographically 
adjacent dialects, but between all varieties in a data set. The result is an n x n 
similarity matrix, where n is equal to the number of dialects in the data set. Like 
Séguy’s data, Goebl’s data originates from dialect atlases (Atlas linguistique de la 
France and Atlante italo-svizzero). The similarity between two dialects is calculated 
by counting for how many features in the data set two dialects use the same variant 
and for how many features they differ. The percentage of similar variants is the 
similarity measure. In Goebl’s method the linguistic similarity is plotted on maps 
by choosing one reference site and by a color scheme displaying the similarity with 
all other sites (thus using only one vector with n values from the n x n similarity 
matrix). With this approach n different maps can be created of a data set with 
n sites. Each map shows how the linguistic landscape looks from the view of the 
speakers of one specific dialect. A problem with this approach is that in a data set 
with a large number of data sites a large number of maps can be created, which 
complicates interpretation. 
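
The similarity measure and the reference-site view described above can be sketched as follows. The site names, the feature variants and the function names are hypothetical, and the sketch assumes that every site has a value for every feature; handling of missing data in atlas material is not addressed.

```python
def similarity(variants_a, variants_b):
    """Percentage of features for which two dialects use the same variant."""
    same = sum(1 for a, b in zip(variants_a, variants_b) if a == b)
    return 100.0 * same / len(variants_a)


def similarity_matrix(sites):
    """Return the site names and the n x n matrix of pairwise similarities."""
    names = list(sites)
    matrix = [[similarity(sites[p], sites[q]) for q in names] for p in names]
    return names, matrix


# Hypothetical data: three sites, four features with categorical variants.
sites = {
    "A": ["apical r", "kvinna", "inte", "hus"],
    "B": ["apical r", "kvinna", "icke", "hus"],
    "C": ["dorsal r", "kona", "icke", "hus"],
}
names, matrix = similarity_matrix(sites)

# A map for one reference site uses a single row of the matrix.
reference = "A"
print(dict(zip(names, matrix[names.index(reference)])))
```

Each row of the matrix yields one map, which is why a data set with n sites gives n possible maps.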

Goebl also introduced cluster analysis as a dialectometric method. In cluster 
analysis the distances between all dialects are analyzed and a partitioning is made 
of the data so that the most similar dialects will belong to the same group. With 
cluster analysis dialect areas can be detected and the distances are not observed 
from one specific site, but the perspective is that of an “objective observer”. 
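
As an illustration of how a partitioning can be derived from pairwise distances, the sketch below uses agglomerative clustering with average linkage. This particular algorithm is an assumption of the sketch; the text above does not specify which clustering algorithm or linkage Goebl used, and the distance values and the function name are hypothetical.

```python
def average_linkage_clusters(dist, n_clusters):
    """Merge the two closest groups repeatedly until n_clusters remain.

    dist is a symmetric n x n matrix of linguistic distances; the distance
    between two groups is the mean of all pairwise distances between them.
    """
    clusters = [[i] for i in range(len(dist))]

    def group_distance(c1, c2):
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters))]
        p, q = min(pairs,
                   key=lambda pq: group_distance(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical distances between four dialects (indices 0 to 3).
distances = [
    [0.0, 1.0, 8.0, 9.0],
    [1.0, 0.0, 7.0, 8.5],
    [8.0, 7.0, 0.0, 1.5],
    [9.0, 8.5, 1.5, 0.0],
]
print(average_linkage_clusters(distances, 2))
```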

Kessler (1995) introduced the Levenshtein distance (also called string edit dis- 
tance) as a tool for measuring dialect distances. This is a more refined measure than 
the binary counts of Séguy and Goebl, since phonetic similarity of segments can 
be taken into account (for a detailed description, see Nerbonne & Heeringa, 2010). 
Gooskens & Heeringa (2004) validated phonetic distances between Norwegian dia- 
lects measured with the Levenshtein distance with perceptual distances and found 
a high correlation. Other methods for aligning transcriptions and measuring dia- 
lect distances have been proposed as well, for example, Pair Hidden Markov Models 
(Wieling, Leinonen, & Nerbonne, 2007) and an iterative pairwise aligning algorithm 
used to produce multiple sequence alignments (Prokić, Wieling, & Nerbonne, 2009). 
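
The Levenshtein distance itself can be sketched with the standard dynamic programming formulation. The optional substitution cost function shows where phonetic similarity between segments could enter the measure; the graded costs in `vowel_aware_cost`, the vowel set and the example transcriptions are hypothetical and are not the segment distances used by Kessler (1995) or in later work.

```python
def levenshtein(source, target, substitution_cost=None, indel_cost=1.0):
    """Minimum total cost of insertions, deletions and substitutions."""
    if substitution_cost is None:
        substitution_cost = lambda a, b: 0.0 if a == b else 1.0
    previous = [j * indel_cost for j in range(len(target) + 1)]
    for i, a in enumerate(source, 1):
        current = [i * indel_cost]
        for j, b in enumerate(target, 1):
            current.append(min(
                previous[j] + indel_cost,                # deletion
                current[j - 1] + indel_cost,             # insertion
                previous[j - 1] + substitution_cost(a, b),  # substitution or match
            ))
        previous = current
    return previous[-1]


VOWELS = set("aeiouyåäö")  # hypothetical, orthography-based vowel set


def vowel_aware_cost(a, b):
    """Hypothetical graded cost: vowel-for-vowel swaps are cheaper."""
    if a == b:
        return 0.0
    return 0.5 if a in VOWELS and b in VOWELS else 1.0


print(levenshtein("hus", "hys"))                    # binary segment costs
print(levenshtein("hus", "hys", vowel_aware_cost))  # graded segment costs
```

With binary costs every differing segment counts equally, whereas the graded cost function lets similar segments contribute less to the dialect distance.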

Heeringa (2004) applied the Levenshtein distance to Dutch and Norwegian dialect
data. In addition to applying cluster analysis to the n × n distance matrices
with the aggregate linguistic distances between all varieties, Heeringa used
multidimensional scaling (MDS). MDS is a technique for visualizing pairwise distances
in a low-dimensional space. Based on the pairwise distances, positions in a
low-dimensional space that approximate the original distances can be calculated (see
also § 7.1 below). While Goebl’s method for visualizing distances between dialects
allows visualization of distances only from one reference point at a time, the
advantage of MDS is that the relative linguistic distances between all varieties are
displayed simultaneously. MDS has been used in linguistics since Black (1973). The
results of MDS are usually visualized in two-dimensional Cartesian coordinate systems.
In the coordinate system the positions of lects reflect the linguistic distances instead
of showing geographic distances like a map. Chambers & Trudgill (1998, 147) comment
on the use of MDS in dialectology:


“One possible objection to multidimensional scaling is that it eliminates 
geographic distance in favor of statistical distance, so to speak. However, 
it is the difference between the two types of distances that proves to be 
one of the most telling aspects of the analysis.” 
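
How well a low-dimensional configuration approximates the original distances, as described above, can be checked numerically. The sketch below computes the squared correlation between the original distances and the Euclidean distances in a given configuration, which is one possible way of quantifying the fit. The function name fit_r2, the three invented sites and their coordinates are assumptions for illustration; the code does not compute an MDS solution itself.

```python
import math
from itertools import combinations


def fit_r2(original, coords):
    """Squared Pearson correlation between original and configuration distances."""
    pairs = list(combinations(sorted(coords), 2))
    x = [original[frozenset(p)] for p in pairs]
    y = [math.dist(coords[p[0]], coords[p[1]]) for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical linguistic distances between three invented sites.
original = {
    frozenset(("A", "B")): 3.0,
    frozenset(("B", "C")): 4.0,
    frozenset(("A", "C")): 5.0,
}

# Two candidate configurations: one in two dimensions, one in a single dimension.
coords_2d = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}
coords_1d = {"A": (0.0,), "B": (3.0,), "C": (7.0,)}

r2_2d = fit_r2(original, coords_2d)
r2_1d = fit_r2(original, coords_1d)
print(round(r2_2d, 3), round(r2_1d, 3))
```

In this toy case the two-dimensional configuration reproduces the distances exactly, whereas forcing the three sites onto a single dimension distorts them.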


When MDS is applied to linguistic data three dimensions usually explain so much
of the total variance in the data that scaling to more than three dimensions is not
considered necessary. Wilbert Heeringa and Peter Kleiweg¹ found a way to map
the results of three-dimensional MDS to geography by using the RGB color model
(Nerbonne, forthcoming). The method is explained more closely in Appendix B of
this thesis. By using the three-dimensional color scheme for coloring maps, a
three-dimensional linguistic space is linked to the two-dimensional geographic space.
This facilitates the interpretation of results obtained by MDS. The method is used for
visualizing dialect distances throughout Chapter 7 in the present thesis.

¹ Or, actually, both of them give the credit for coming up with the idea to the other
(Nerbonne, forthcoming). The method was first used by Nerbonne, Heeringa, & Kleiweg
(1999).

In the dialectometric work by Heeringa and Nerbonne (e.g. Heeringa, 2004; Nerbonne
& Siedle, 2005) cluster analysis and MDS have been used side by side. While
cluster analysis detects dialect areas, the results of MDS show that the dialectal
variation within the areas and at the boundaries is actually continuous and that the
borders are not as abrupt as suggested by clustering. These complementary analyses
are in line with how traditional scholars have dealt with geographic variation. While
researchers agree that linguistic variation is gradual, they have also found it
important to group dialects into larger areas and to describe the most prominent
dialectal features of each area.

Recent research has shown that cluster analysis should be applied with caution
to dialect data (Nerbonne, Kleiweg, Heeringa, & Manni, 2008; Prokić & Nerbonne,
2008). Small differences in the input data can lead to substantially different
clustering results. Because of this, the results of different clustering algorithms
should be compared and carefully evaluated. In data that is truly continuous,
clustering algorithms are unlikely to find meaningful clusters.

A different approach to detecting dialect areas was proposed by Hyvönen, Leino,
& Salmenkivi (2007) and Leino & Hyvönen (2008). They worked with data very different
from the data in the previously mentioned studies. The data comprised lexical
distribution maps from the Dictionary of Finnish Dialects and was binary: either a
lexical item had been recorded at a municipality or it had not. The data suffered
from uneven sampling; some municipalities had been thoroughly sampled, while the
data from other places was sparse. Hence, the absence of a record of a lexical item at
a site meant either that it was not used in that dialect or that it just did not happen
to be recorded. While cluster analysis and MDS did not perform well on the data,
different component models (factor analysis, non-negative matrix factorization,
aspect Bernoulli, independent component analysis and principal component analysis)
detected distribution patterns corresponding to dialect areas. With these methods
it was also possible to factor out the effect of uneven sampling. These methods do
not make sharp divisions into dialect areas like cluster analysis, but show core areas
and transitional zones. The five different component methods compared by Leino
& Hyvönen (2008) all presented the data in slightly different ways. A conclusion of
the study was that factor analysis was the most stable method producing the most
easily interpretable results.

The dialectometric research tradition which started with Séguy and was continued
by, among others, Goebl, Heeringa and Nerbonne, has focused on the aggregate
analysis, that is, the picture that emerges when all available data is considered
(Nerbonne, 2009). This was a reaction against the isogloss method, which analyzes only
one variable at a time or a limited number of isoglosses at best. However, a drawback
of the aggregate measures of linguistic similarity/dissimilarity used in dialectometry
is that it is hard to trace back the linguistic features characterizing the linguistic
areas that have been detected in the aggregate. Moreover, the aggregate distances
can hide varying distributions of dialectal features in the original data.

Sometimes different distribution patterns can reveal more about the causes of 
dialectal variation than an aggregate analysis does. For example, in Scandinavian 
dialectology two centers have traditionally been identified from which novel features 
have spread: a southern and a central one. Features that have spread from the 
south have a different distribution than features that have spread from the center, 
and the influence of these two centers can be traced back to different time periods 
(Pettersson, 2005). While concentrating on the aggregate analysis, dialectometric 
methods are likely to ignore the different underlying distribution patterns below the 
aggregate level, like the two different innovation centers in Scandinavia. Aggregate 
analysis gives a view of the relationships between dialects, but in order to explain 
the relationships the diffusion patterns are important. 

Some attempts have been made to identify linguistic structure in the aggregate
analysis. Prokić (2006) applied aggregate dialectometric analysis to Bulgarian
dialect data and, in addition, extracted the most frequent regular sound
correspondences between dialects from the same data set. She found that out of the
ten most common regular correspondences eight were correspondences between two vowels
or insertions/deletions of vowels, which suggests that the vowels are largely
responsible for the classification attained by the aggregate analysis. For each of the regular
sound correspondences a map could be created showing the geographic distribution. 

Nerbonne (2006) applied factor analysis to vowel data from the Linguistic Atlas 
of the Middle and South Atlantic States, which contains transcribed lexical items. 
Vowels were identified by the word they were extracted from (for example, the first 
vowel in the word “afternoon”). Factor analysis revealed which vowels could explain
most of the variance and also which vowels had similar distributions. However, 
manual investigation of the data was needed in order to identify the most important 
variants of the vowels involved. 

Factor analysis was also used by Clopper & Paolillo (2006) to analyze formant 
frequencies and duration of 14 American English vowels as produced by 48 speakers 
from six dialect regions. The analysis showed regional patterns and co-occurrence of 
some vowel features, but the analysis was complicated by interactions with speaker
sex.

While the essential difference between dialectometry and traditional dialect geo- 
graphy from the start has been the focus on aggregate analysis in dialectometry, 
the word dialectometry literally means ‘measuring dialect’. Literally dialectometry 
could, thus, include any quantitative/computational analysis of dialects. Methods 
first introduced in dialectometry have also spread to research that does not include 
an aggregate approach (for example, studies mentioned in § 2.2.3). As large digitized
speech corpora become available for analysis, the benefits of using statistical,
multivariate methods become obvious. These methods can find relationships in
data sets which are too large and complex for manual analysis. Besides being able 
to handle large amounts of data, data-driven methods also make the analysis more 
objective. Often patterns are recognized that were not expected by the researcher 
(Chambers & Trudgill, 1998, 141). Nonetheless, the analysis is of course only as ob- 
jective as the input, and the data for statistical analysis should be carefully chosen 
and interpreted. It should also be noted that the methods used in dialectometry are 
usually exploratory and not confirmatory. They can be used to describe data, but 
not to test hypotheses. 

In the present thesis, factor analysis (FA) and multidimensional scaling (MDS) 
were applied in order to explore the dialectal variation in Swedish vowel pronunci- 
ation. Leinonen (2008) showed that, in contrast to cluster analysis and MDS, FA 
is able to detect different diffusion patterns in dialect data and find co-occurring 
features. In the paper mentioned, vowel quality in Swedish dialects was measured 
at the temporal midpoint of each vowel segment and only geographic variation was 
analyzed. As described in § 2.3, diphthongization is an important characteristic 
for regional varieties of Swedish. Therefore, the present study extends the analysis 
of Leinonen (2008) by adding spectral change as a variable in FA. Moreover, not 
only geographic variation, but also within-site variation is studied in this thesis. A 
further aim was to compare different levels of aggregation. Prior to applying FA 
and MDS, the variation in each variable is analyzed separately. FA represents an 
intermediate step of aggregation, where variables with similar geographic and/or 
social distribution patterns are bundled together, while MDS gives the aggregate 
view that emerges when all variables are considered simultaneously. The relation- 
ship between the three different levels of aggregation and the extent to which the 
analyses complement each other is explored in this thesis. 


Chapter 3 
Aims and research questions 


The main aim of this thesis was to describe the geographic variation in vowel pro- 
nunciation across the Swedish language area. As described in §§ 2.2 and 2.3 both 
rural Swedish dialects and regional varieties of Standard Swedish vary a lot when 
it comes to vowel pronunciation, and vowels have been important for characterizing 
varieties of Swedish and classifying dialects. Still, no exhaustive acoustic description 
of variation in Swedish vowel pronunciation exists (Bruce, 2010, 103). 

By carrying out an acoustic analysis of vowels from a large number of varieties 
of Swedish, I hoped to be able to answer the following questions: 


1. How is the variation in Swedish vowel pronunciation distributed geographic- 
ally? 


2. Do different vowel features show co-variation? 


The Swedish dialects have undergone major changes during the last century. This
raises general questions about the dialect situation around the year 2000:


3. How large is the dialectal variation and in which areas are divergent rural 
dialects still spoken? 


4. Which Swedish dialects are changing? Which are stable? 


The data set analyzed in this thesis includes speakers of two generations, which 
made an apparent time study of language change possible. Based on the societal 
and linguistic changes described in § 2.1.2 the hypothesis was that the distances 
between dialects would be shorter in the younger generation of speakers than in the 
older generation. Relevant questions were: 


5. How much change in vowel pronunciation can be observed between older and 
younger speakers? 


6. Which vowels are changing and in what direction? 


7. Which vowel features are stable?


Gender is a social variable that has been shown to correlate with linguistic variation 
in many studies (Chambers & Trudgill, 1998, 61). The gender-related variation in 
vowel pronunciation was not studied in as much detail as the variation across age 
groups in this thesis, but at an aggregate level an answer was sought to the following 
question: 


8. Is there gender-related variation in Swedish vowel pronunciation? 


Principal component analysis of Bark-filtered vowel spectra was chosen as a measure
of vowel quality for this thesis, since this approach is more reliably automatable
than formant measurements. A representation in Bark filters gives a good perceptual
representation of vowels, because the Bark scale corresponds to the critical
bandwidth of human hearing. Acoustic analysis of vowels is not unproblematic, as
explained in § 2.4. The main problem for dialectological and sociolinguistic studies
of vowel pronunciation is how to normalize for speaker variability related to the
anatomy/physiology of speakers in order to be able to analyze linguistic differences.
A large number of normalization procedures have been proposed (see § 2.4.4), but
most of them depend on the varieties being compared sharing some common traits
that can be used as a basis for the normalization. When no common denominators,
like comparable mean values and standard deviations or common point vowels, exist,
normalization fails.
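
To give an impression of what a Bark representation does to the frequency axis, the following sketch converts frequencies in Hz to Bark. It assumes the approximation by Traunmüller (1990), z = 26.81 f / (1960 + f) - 0.53, which is one of several published Hz-to-Bark formulas and is not necessarily the one underlying the filters used in this thesis. The function name hz_to_bark is likewise chosen for illustration.

```python
def hz_to_bark(f_hz):
    """Approximate Bark value for a frequency in Hz (Traunmueller, 1990)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


# The same 100 Hz step spans far more of the Bark scale at low
# frequencies than at high frequencies.
low_step = hz_to_bark(200.0) - hz_to_bark(100.0)
high_step = hz_to_bark(4100.0) - hz_to_bark(4000.0)
print(round(hz_to_bark(1000.0), 2), round(low_step, 2), round(high_step, 2))
```

The compression of the higher frequencies is what makes equal steps on the Bark scale correspond more closely to equal perceptual steps than equal steps in Hz do.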

The Swedish dialects show so much sub-phonemic and phonemic variation in 
vowels that the kind of common denominators mentioned above cannot be found for 
all dialects. The present study included a relatively small number of speakers per 
variety, and in addition the number of men and women was not equal in all speaker 
groups. Pure group averages of the acoustic measures, used to reduce the influence
of speaker-dependent variation, would therefore have been biased by the systematic
differences in the vowel spaces of men and women. A question related to the acoustic
analysis of the vowels was:


9. To what extent can speaker-dependent variation in the acoustic measures be 
reduced? 
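
The bias caused by unequal numbers of men and women can be illustrated with a small calculation. The values below are invented numbers for a formant-like acoustic measure, and averaging the per-gender means is shown only for contrast as one conceivable alternative; it is not the procedure used in this thesis.

```python
from statistics import mean


def pooled_mean(values_by_gender):
    """Average over all speakers, ignoring gender."""
    return mean(v for values in values_by_gender.values() for v in values)


def balanced_mean(values_by_gender):
    """Average of the per-gender averages, so each gender weighs equally."""
    return mean(mean(values) for values in values_by_gender.values())


# Invented values (Hz) for the same vowel at two hypothetical sites with
# identical pronunciation but a different gender composition.
site_x = {"men": [500.0, 510.0, 490.0], "women": [600.0]}
site_y = {"men": [500.0], "women": [600.0, 610.0, 590.0]}

print(pooled_mean(site_x), pooled_mean(site_y))
print(balanced_mean(site_x), balanced_mean(site_y))
```

The pooled averages of the two sites differ although the men and the women at both sites have the same average values, so the difference reflects the composition of the speaker groups rather than a linguistic difference.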


Dialectometry has introduced aggregate analysis of dialectal variation as an alternat- 
ive to detailed analysis of separate variables. Aggregate analysis allows the researcher 
to find out how dialects relate to each other when all available data is considered 
simultaneously, instead of looking at individual features, which is what generally 
has been done in traditional dialectology. Methods commonly used for aggregate 
analysis of dialects are cluster analysis and multidimensional scaling. The prob- 
lem of how to identify linguistic structure in the aggregate has not been completely 
solved yet. A methodological aim for this thesis was to analyze the relationship 
between aggregate analysis and underlying distributions of individual features. In 
the paper Factor Analysis of Vowel Pronunciation in Swedish Dialects (Leinonen,
2008) I showed that factor analysis is an effective method for identifying linguistic
features that show similar geographic distributions, and displaying these distribu- 
tion patterns on maps. By comparing the results obtained by factor analysis and by 
multidimensional scaling in this thesis I wanted to approach the questions: 


10. How can analysis on the variable level and aggregate analysis supplement each 
other in the study of dialectal variation? 


11. Can a comparison of variation on the variable level and aggregate analysis 
explain what kind of variation the aggregate analysis accounts for? 


As the aggregate analysis shows how varieties relate to each other when all variables 
are considered, it can provide a basis for a dialect classification. Questions for the 
aggregate analysis were: 


12. How can the Swedish dialects be classified based on vowel pronunciation? 


13. Does a classification of modern varieties of Swedish correspond to traditional 
divisions of Swedish dialects? 


The data for this study comes from the SweDia database (see Chapter 4). Within the
SweDia project, work has been carried out that covers the same Swedish sites and
speakers as the present study but other linguistic levels (see § 2.2.3). A comparison
with these studies can give an account of the association between linguistic levels:


14. Does a classification of Swedish dialects based on vowel pronunciation corres- 
pond to typologies based on other linguistic features? 


Answers to questions 1-7 are given in Chapter 6, where variation on the variable 
level is studied. Question number 1 is more specifically related to § 6.1 and question 
2 to § 6.3. Answers to questions 3-7 are found in all sections of Chapter 6. 

Chapter 5 describes the acoustic analysis of the vowels. Question number 9 is
dealt with more specifically in § 5.1.5.

In Chapter 7 aggregate analyses are described and answers are given to questions
8, 11 and 12. Questions 3-5 are also partially answered in Chapter 7.

In Chapter 8, question number 10 is approached by comparing the results of 
Chapters 6 and 7. Questions number 13 and 14 are not subject to any quantitative 
analyses but are discussed in general terms in § 8.2.1. 


Chapter 4 


Data 


In this chapter the data set analyzed in this thesis is described. General information 
about the SweDia database where the data comes from is given in § 4.1. The specific 
data set and the vowels that were chosen for the analyses are described in § 4.2, and 
in § 4.3 the speakers are described in more detail. 


4.1 The SweDia Corpus 


SweDia 2000 (Eriksson, 2004a,b) was a project carried out as a joint effort between
the Swedish universities of Lund, Stockholm and Umeå. The aim was to document,
analyze and describe the dialectal variation in the Swedish language area, with a 
special focus on phonetic and phonological descriptions. The project was financed by 
The Bank of Sweden Tercentenary Foundation and was carried out between 1998 and 
2003. Dialect data were recorded at 107 sites in Sweden and the Swedish-language 
parts of Finland. At each site recordings were made with approximately twelve 
speakers representing two generations. The older speakers were in the approximate 
age range of 55-75 years and the younger speakers of 20-35 years. An equal number 
of male and female speakers were recorded in each age group. Hence, each recording 
site was represented by three older women, three older men, three younger women 
and three younger men, with a few exceptions (see § 4.3). 

The sites for the recordings were chosen to represent the rural dialects, so that 
no speakers from cities or larger towns were included in the database. The motiv- 
ation for this was that traditional rural dialects tend to disappear in large parts of 
the Swedish language area and needed to be recorded before being completely lost 
(Eriksson, 2004b, see also § 2.1.2). Moreover, the language varieties in the cities have 
developed under different premises and have to be studied with different methods 
from those used for rural dialects. The language varieties in the cities have been 
influenced by large-scale immigration from the countryside and in the cities there is 
significantly more social and linguistic stratification than in rural areas (Nordberg,
2005). The locations for the database were chosen to represent the dialectal situation
in the Swedish language area by being balanced geographically and with respect to 
population density. To enable diachronic comparison, locations that had been sub- 
ject to previous studies were favored when the geographic distribution allowed a 
choice between nearby locations. 

The database consists of two types of data: a) spontaneous speech, and b)
a controlled part for which specific phonetic and linguistic features were elicited.
The spontaneous speech comprises free interviews with dialect speakers or dialogs 
between two dialect speakers. The controlled data focused on three specific areas 
of the dialects: 1) the sound system, 2) intonation and tone accents, and 3) the 
quantity system. 

The recordings were made with a lapel microphone and a portable DAT-recorder. 
The recordings were done at 48 kHz sample rate and 16-bit amplitude resolution. 
Before analysis the data were downsampled to 16 kHz/16 bit. 

The recordings were made in the speakers’ homes or other familiar places in order 
to make the participants feel comfortable and to make the use of the local vernacu- 
lar feel natural. A quiet room without much reverberation was chosen to ensure 
good quality recording. Living rooms with many soft surfaces were preferred over 
kitchens, which contain many hard surfaces that produce reverb. Two interviewers
were generally present at the recording sessions: one carried out the actual
interview while the other was responsible for the technical equipment. In many
cases the interviewers were not speakers of the local vernacular, but spoke a regional 
dialect or the regional standard language representing the larger geographical region 
of the recording site. 

The fact that the interviewers are not speakers of the local vernacular might be 
a problem when recording dialect data. Especially in the Swedish language situ- 
ation where most speakers use code-mixing when varying between local, informal 
speech and more formal speech it is difficult to say to what extent the speech of 
the interviewers has influenced the speakers. In a study of the local vernacular of
Burträsk in the province of Västerbotten in the north of Sweden, Thelander (1979)
recorded subjects in different speech situations and measured the use of dialectal
versus standard forms of a number of morphological and morpho-phonological
features. First, the local subjects participated in a free discussion with four locals.
After one hour a “stranger” entered the discussion. He or she originated from the
north of Sweden but not from Burträsk County and was introduced as a member of
the research staff. There was no evidence of a complete code-switch, but the use of 
standard variants of the variables increased significantly among the local speakers 
after the “stranger” had entered the discussion. However, the increase of standard 
forms was strongest in sentences following immediately after the “stranger” had been 
talking. A number of the speakers who participated in the group discussions were 
also interviewed in Thelander’s study. In the interview situation the use of stand- 
ard forms was considerably higher than in the group discussions, and the difference 
between the discussion and the interview situation was larger than between the group
discussions with and without the “stranger”. When the speakers were explicitly told
that the purpose was to record the local colloquial language there was less difference in
speech style in the different situations. 

To avoid influencing the speaking style of the participants in the SweDia project 
the interviewers tried to talk as little as possible and to give feedback with facial 
expressions and body language rather than using verbal feedback (Aasa et al., 2000). 
The participants were also told explicitly that the purpose of the interview was to 
record the local dialect. For some speakers it was more difficult than for others to
keep speaking the local dialect when the interviewers spoke another variety. When
eliciting the controlled data, the participants sometimes had to be reminded to give 
the local forms (Sw. bondska) instead of using standard variants. 


4.2 Vowel data 


The vowel data for this thesis come from the controlled part of the SweDia database
focusing on the sound system. The purpose of this part of the database was to make
phonetic and phonological analyses of the vowel systems of the dialects possible. As 
described in § 2.3.3 the Swedish dialects vary a lot with respect to the vowel systems, 
and the phoneme systems of many present day dialects are unknown. Therefore, the 
word list for eliciting vowels was put together so that it would not only cover the 
Standard Swedish vowel system, but also reflect some Proto-Nordic features that are 
known to be preserved in some dialects. 

A list of approximately 30 different words was put together for eliciting the vowel 
data. The list was, however, not constant across all sites but varied to some extent. 
For some of the most divergent dialects the word list turned out to be unsuitable, 
because many of the words in the list were not used in the local vernacular. This was 
the case for the sites Munsala, Orsa and Älvdalen, for which separate word lists were
created. For the same reason, a few words in the original list had to be replaced for 
all or some of the speakers at some other sites. In order to keep the phonetic context 
of the vowels as stable as possible, words were chosen where the target vowels are 
surrounded by coronal consonants. Only words with an /r/ following the vowel were
an exception to this rule, since some varieties of Swedish have a dorsal /r/ and
not an apical /r/ like Standard Swedish. Including vowels in pre-/r/ context was 
still important because some of the Swedish vowels have allophonic variants that 
occur only before /r/. 

The interviewers had prepared short questions¹ that would make the dialect
speakers come up with a certain word. Once a speaker had guessed the right word,
the word was repeated 3-5 times in isolation.

¹ For example, “Vad använder fiskaren för att fånga fisk?” ‘What does the fisherman
use for catching fish?’, answer: “nät” ‘net’.

The word list data were manually segmented and transcribed within the SweDia
project and partly in the follow-up project SweDat. The segmentation and
transcription were mainly done by student assistants working within the project. For
each word, only the target phoneme was segmented and transcribed. The transcription is
a rough phonetic transcription. The aim of the SweDia project was to gather data
and provide researchers with the data for experimental and quantitative research
purposes. The transcriptions were mainly intended to offer an overview of the data
as well as to serve as a search tool. By not transcribing the data in fine phonetic detail
variation due to different transcribers was reduced. One example of the roughness of
the transcriptions is that diphthongs are usually not transcribed, but indicated with
the symbol of the nearest monophthong and an additional label “dift”. For the
present thesis, vowel segments were analyzed acoustically. The transcriptions in the
database were only used for identifying point vowels (see § 5.1.3).


4.2.1 Selected vowels 


For this thesis a selection of the vowels in the SweDia word lists was made. Table 4.1 
displays the 19 vowels selected: twelve long vowels and seven short vowels. Only 
words that were used for eliciting vowel phonemes at most of the locations in the 
database were chosen. Moreover, it turned out that some of the words in the original 
word lists were problematic for eliciting dialect data. The selected words include all 
the Standard Swedish long vowel phonemes and the allophones of /ɛ:/ and /ø:/. Of
the Swedish short vowel phonemes four are missing: /ø/, /ɛ/, /ʊ/ and /ɵ/. However,
the pre-/r/ allophones of /ɛ/ and /ø/ (that is, [æ] and [œ]) are represented.

For some reason, the /o:/ vowel was elicited with the word låt in the southern
parts of the language area (administered by the university of Lund) and with lås in
the central and northern parts (administered by the universities of Stockholm and
Umeå). Even though, as a rule, only vowels elicited with the same word all over the
language area were used for this study, an exception was made for /o:/, so that the
complete set of Swedish long vowels would be represented.

Standard Swedish /ø:/ is represented by two different words, lös and söt, in the
selected data set. The reason for this is that the vowel in lös represents the
Proto-Nordic diphthong /au/, while the vowel in söt was originally a monophthong. Some
dialects have preserved two different phonemes in these two words.


4.2.2 Missing vowels 


The reason that some of the Swedish short vowels are missing in the selected data set 
is that they had not been consistently elicited at all sites for the database. Sometimes 
different words were used at different sites for eliciting the vowels. However, two 
words that have the same vowel phoneme in Standard Swedish do not always have 
the same phoneme in all dialects. For this reason a decision was made to only 
use vowels elicited with the same word for comparison across dialects. The only 
exception to this rule was Standard Swedish /o:/, elicited with lås and låt (see
previous section).

For eliciting the Standard Swedish phoneme /ø/ the word blött ‘wet’ was used.
This turned out to be problematic, because some of the fieldworkers asked for the
adjective blött (neuter form) while others asked for the verb form blött (supine),
which are homophones in Standard Swedish. The vowels in the adjective and the
verb, however, do not have the same historical origin. The vowel in the adjective
blött originates from the Proto-Nordic diphthong /au/, while the vowel in the verb
form originates from Proto-Nordic /eu/. A number of Swedish dialects have preserved
the Proto-Nordic diphthongs, and thus also reflexes of /au/ and /eu/, as
separate phonemes. In the database the words are not tagged for word class, but
only the Standard Swedish orthographic form is given in addition to the phonetic
transcription of the vowel segment. Comparing the reflex of /au/ in one dialect
with the reflex of /eu/ in another dialect would show differences that do not
have a linguistic basis. Therefore, blött was not included in the analysis, so that the
phoneme /ø/ is only represented by its allophone [œ] in the word dörr.

Table 4.1. The words used for eliciting the vowels that comprise the data set for
the current study.

Swedish     Standard    Proto-Nordic    Word class,    English
word        Swedish     diphthong?      form           translation
            vowel

dis         /i:/        no              noun sing.     haze
disk        /ɪ/         no              noun sing.     counter, dishes, disk
typ         /y:/        no              noun sing.     type, jerk
flytta      /ʏ/         no              verb inf.      move
leta        /e:/        yes             verb inf.      seek, look for
lett        /e/         yes             verb sup.      lead
lus         /ʉ:/        no              noun sing.     louse
nät         /ɛ:/        no              noun sing.     net
lär         [æ:]        no              verb pres.     teach
särk        [æ]         no              noun sing.     nightgown
söt         /ø:/        no              adj. sing.     sweet
lös         /ø:/        yes             adj. sing.     loose
dör         [œ:]        yes             verb pres.     die
dörr        [œ]         no              noun sing.     door
lat         /ɑ:/        no              adj. sing.     lazy
lass        /a/         no              noun sing.     load
lås/låt     /o:/        no              noun sing.     lock/tune
lott        /ɔ/         no              noun sing.     lot, share
sot         /u:/        no              noun sing.     soot

For eliciting /ɵ/ the word ludd ‘fluff, fuzz’ was used in the SweDia project. This
was an unfortunate choice, since this word turned out to be unknown to many of the
dialect speakers. In many dialects the words lo and lugg are used instead of ludd.
In lugg the vowel phoneme is the same as in ludd, only the consonant context is
different. The word lo, however, has a different phoneme (a long vowel). Therefore,
the data concerning /ɵ/ is incomplete in the database. This is a shortcoming, since
/ɵ/ is an important dialect marker: in the spoken language around Stockholm and
Uppsala many speakers show a merger of /ɵ/ and /ø/, while, for example, in parts
of the Finland-Swedish dialect area /ɵ/ is lacking as a separate phoneme with the
pronunciation being identical to /ʊ/.

For the phoneme /ɛ/ the word lätt ‘easy’ was used at some sites and läsk ‘soft
drink’ at other sites. While lätt originates from Proto-Nordic, läsk is a relatively
recent formation from the verb läska, which is a Low German loanword in Swedish.
The vowel in the two words does not have the same origin, and the two words might
have different phonemes in some dialects. Therefore the vowel was not included in
the analysis.

The phoneme /u/ was not included in the word lists for all of the locations, and 
could therefore not be included in this study. 

It is regrettable that a few important vowel phonemes have not been consistently 
elicited for the SweDia database. Still, the SweDia data is a unique collection of 
systematically elicited data of modern Swedish dialects from the whole Swedish 
language area. Even though a few vowel phonemes are missing in the analysis in the 
present thesis, the data should be able to give a good picture of how the Swedish 
dialects relate to each other with respect to vowel pronunciation. All Standard 
Swedish long vowel phonemes are included in the data set, and as mentioned in § 2.3 
the geographic variation is more prominent in Swedish long vowels than in Swedish 
short vowels. 


4.3 Speakers


The total number of dialect speakers analyzed in this thesis is 1,170, recorded at 98 
different sites. The sites are displayed in Figure 4.1. In addition, twelve speakers 
of Standard Swedish were included. These speakers were recorded in the SweDia 
project, too, and had been perceived as good representatives of Standard Swedish 
pronunciation. The speakers of Standard Swedish grew up in the greater Stockholm 
area and were all either professional linguists working at a Swedish university or 
students of a linguistic subject at the time of the recording. 

In the SweDia project the aim was to record twelve speakers at each site: three 
older women, three older men, three younger women and three younger men. How- 
ever, at some sites more than twelve speakers were recorded and at some sites the 
fieldworkers did not manage to find three speakers of each speaker group for a re- 
cording. Therefore, the number of speakers varies somewhat across sites and across 
speaker groups. In addition, not all words in Table 4.1 were recorded by all speakers 
at all sites. A decision was made to include only speakers who had recorded at least 
13 out of the 19 vowels. The average number of speakers per site is twelve, but the 
number varies between eight and fourteen. The number of speakers per site included 
in this thesis is shown in Appendix A. 

For the various analyses in this thesis average values per vowel were computed 
for groups of speakers. Three different groupings were made:

• one group per site
• two groups per site: older and younger speakers
• four groups per site: older women, older men, younger women, younger men

Figure 4.1. The 98 sites in Sweden and Finland where the dialect data were
recorded. The four biggest cities in the area are included as reference points in the
map.


When the number of vowels recorded by a group was less than fifteen, the group was 
not included in any analyses. The total number of speakers in each group and the 
number of older and younger speakers and men and women in each group is listed 
in Appendix A. 
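
The inclusion rules and the three groupings described above can be sketched in a few lines. This is only an illustration: the record fields (`site`, `speaker`, `age`, `sex`, `vowel`, `value`) and the function names are hypothetical and do not reflect the actual layout of the SweDia data. Only the two thresholds (13 vowels per speaker, 15 vowels per group) are taken from the text.

```python
from collections import defaultdict
from statistics import mean

MIN_VOWELS_PER_SPEAKER = 13  # from the text: at least 13 of the 19 vowels
MIN_VOWELS_PER_GROUP = 15    # from the text: groups with fewer vowels are dropped


def group_means(records, key):
    """Average value per vowel for each group.

    records: iterable of dicts with the hypothetical fields
             site, speaker, age, sex, vowel, value.
    key:     function mapping a record to its group label.
    """
    # Step 1: keep only speakers who recorded enough different vowels.
    vowels_per_speaker = defaultdict(set)
    for r in records:
        vowels_per_speaker[(r["site"], r["speaker"])].add(r["vowel"])
    kept = {s for s, v in vowels_per_speaker.items()
            if len(v) >= MIN_VOWELS_PER_SPEAKER}

    # Step 2: collect the values of the kept speakers per (group, vowel) cell.
    cells = defaultdict(list)
    for r in records:
        if (r["site"], r["speaker"]) in kept:
            cells[(key(r), r["vowel"])].append(r["value"])

    # Step 3: average each cell and drop groups with too few vowels.
    groups = defaultdict(dict)
    for (group, vowel), values in cells.items():
        groups[group][vowel] = mean(values)
    return {g: v for g, v in groups.items() if len(v) >= MIN_VOWELS_PER_GROUP}


# The three groupings named in the text.
by_site = lambda r: r["site"]
by_site_age = lambda r: (r["site"], r["age"])
by_site_age_sex = lambda r: (r["site"], r["age"], r["sex"])
```

Calling `group_means(records, by_site_age)` would, under these assumptions, return one dictionary of vowel averages for the older and one for the younger speakers of every site that passes both thresholds.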

Some of the analyses presented in Chapters 6 and 7 work with missing data in the 
data matrix, while others do not. Factor analysis (§ 6.3) is sensitive to missing data, 
so only objects with data for all 19 vowels were included. Multidimensional scaling 
(Chapter 7), on the other hand, is based on average vowel distances, which can be 
calculated for a smaller number of vowels without biasing the results. Therefore, a 
larger number of speakers are included in the multidimensional scaling than in the 
factor analysis. Groups without the full number of vowels are indicated by footnotes 
in Appendix A. 
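
The difference between the two ways of handling missing data can be illustrated as follows. The sketch assumes that each group is stored as a mapping from vowel to a pair of component scores and that vowel distances are Euclidean; both are assumptions made for the example (the distance measure actually used is described in Chapter 7), and the function names are hypothetical.

```python
import math


def mean_vowel_distance(a, b):
    """Average distance between two groups over the vowels both of them have.

    a, b: dicts mapping vowel -> (pc1, pc2). Euclidean distance is an
    assumption of this sketch.
    """
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no shared vowels")
    return sum(math.dist(a[v], b[v]) for v in shared) / len(shared)


def complete_cases(groups, n_vowels=19):
    """Keep only groups with data for all vowels (as required for factor analysis)."""
    return {name: g for name, g in groups.items() if len(g) == n_vowels}
```

Because the distance is an average, a group with 17 vowels is compared on 17 vowels and still ends up on the same scale as a group with all 19, whereas `complete_cases` simply removes it.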

The average birth year for the older speakers is 1933 and for the younger speakers 
1973. When the recordings were made, the average age of the older speakers was 
66 and the average age of the younger speakers 26. As the histograms in Figure 4.2 
show, the age range is larger for the older speakers than for the younger speakers. 
The older speakers were born between 1911 and 1957, while the younger speakers 
were born between 1959 and 1982. 

By including speakers from two age categories in the SweDia database one pur- 
pose was to make studies of language change in apparent time possible. The age 
range of the younger speakers was deliberately chosen so that the group would not 
include teenagers, but somewhat older speakers. While the language of teenagers can
include features that are dropped when the speakers get older, speakers in their 20s
or early 30s who still lived in the municipality where they grew up were considered
to be representative of the local dialect (Eriksson, 2004b).

Figure 4.2. Histograms of the birth years of the speakers.

The speakers were all born in the area where they were recorded and had lived 
there most of their lives. An additional requirement for the younger speakers was 
that their parents were speakers of the local vernacular. For the older speakers this 
was not a requirement, but the parents of most of the older speakers were at least 
from the same larger region. 


Chapter 5 


Acoustic measures of vowel quality


As described in § 2.4.3, acoustic measures of vowel pronunciation are generally
influenced by the anatomy/physiology of the speaker. The largest differences related
to anatomy/physiology can be found between men, women and children, but
speaker-dependent variation is also found within these three groups. This is a problem
in dialectological and sociolinguistic research, since researchers are mainly interested
in the socially and geographically conditioned variation and would like to disregard
variation related to anatomy.

As mentioned in § 2.4.4, a large number of methods have been developed that
attempt to normalize for the speaker-specific variation in formant measurements.
However, these normalization procedures are successful only to some extent. The 
main finding is that normalization procedures can and should be applied only to data 
sets that are fully phonologically comparable, that is, have the same mean value and 
standard deviation (Disner, 1980; Adank, 2003). This means that normalization 
procedures can only be used when all speakers have the same phoneme system or at 
least share the same stable point vowels. 

In some variationist studies the problem of speaker variability is solved by aver- 
aging over a number of speakers for each variety. This can be done if the groups are 
large enough and if the share of men and women is equal across all groups. If not, 
anatomical/physiological differences will bias the results. 

In the Swedish language area there are dialects with very deviant vowel systems
(see § 2.3.3). Moreover, not all the vowels in the data set for the current thesis were
recorded by all speakers (see § 4.3). This makes the use of standard normalization
procedures impossible. Because of the differing number of men and women per variety
and per vowel, averaging over all speakers per variety would also not normalize
for the variation related to speaker anatomy/physiology.

Principal component analysis (PCA) of band-pass filtered vowel spectra has been
shown by Jacobi (2009) to be a suitable method for large-scale language variation
studies (see § 2.4.2). One advantage of this method, introduced by Plomp, Pols,
& Van de Geer (1967), is that it can be fully automated without leading to errors
of the kind made by formant tracking algorithms. Because the method does not
require manual correction of the results it can easily be applied to large amounts of
data. Still, this method is also sensitive to speaker variability and to some extent
to the amount of noise in the recordings (Jacobi, 2009, 59-63). Jacobi (2009, 63-66)
related the vowel pronunciations of each speaker to his or her point vowels and
could thus reduce the speaker- and recording-specific variation. This could be done
because the Dutch point vowels (/a/, /i/, /u/) are assumed to be stable across the
whole language area.

PCA of band-pass filtered vowel spectra was chosen as a measure of vowel quality
for the present thesis. Because the data comprises nearly 1,200 speakers, it
was essential to choose a method which can be automated to a greater extent than
formant measurements. Since the Swedish point vowels are not stable across all
dialects, Swedish does not offer the opportunity to use point vowels to reduce
speaker-dependent variation. However, using PCA on Bark-filtered vowel spectra offers an
opportunity to eliminate the largest source of speaker-dependent variation: the
one caused by anatomical/physiological differences between men and women. The
method is described in § 5.1. In § 5.2 principal components of Bark-filtered spectra
are compared with formants, and the principal components are interpreted in
relation to formant frequencies.


5.1 Principal component analysis of Bark-filtered vowel spectra


The method chosen for assessing vowel quality for this thesis comprises two steps: 
Bark filtering and principal component analysis (PCA). The method is described in 
the following sections, and in § 5.1.7 a short summary of all the steps of the analysis 
is given. 


5.1.1 Bark filtering 


Using the Praat¹ software, vowel spectra were filtered with Bark² filters up to 18
Bark with a window length of 13 ms. Each pass band had a bandwidth of one Bark
and adjacent filters overlapped at −3 dB (Jacobi, 2009, Fig. 3.3). Following Jacobi
(2009, 55), 18 Bark was chosen as the highest frequency. The frequency range up to
ca. 18 Bark is where the first three formants of vowels are found. Higher frequencies
are not used by listeners for identifying vowels but mainly show speaker-specific
variation. Table 5.1 shows the mid-frequencies in Hertz of the 18 Bark filters.


¹ Praat: phonetic software. Version 5.1. By P. Boersma and D. Weenink, University of
Amsterdam. <http://www.fon.hum.uva.nl/praat/>

² A number of different algorithms have been proposed for modeling the Bark scale. Praat uses
the conversion formula $7 \ln\left(\mathrm{Hz}/650 + \sqrt{1 + (\mathrm{Hz}/650)^2}\right)$.
See also the description of the Bark scale in § 2.4.2 (p. 28).

Table 5.1. Mid-frequencies of the 18 Bark filters in Hertz

Bark     Hz     Bark     Hz     Bark     Hz
   1     93        7    764       13   2031
   2    188        8    915       14   2357
   3    287        9   1086       15   2732
   4    392       10   1278       16   3163
   5    505       11   1497       17   3657
   6    628       12   1746       18   4228
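
The mid-frequencies in Table 5.1 appear to be the Hertz equivalents of the integer Bark values 1 to 18 under the conversion formula that Praat uses. A minimal sketch of the conversion in both directions is given below; it relies on the identity ln(x + √(1 + x²)) = asinh(x), and the function names are chosen for this example only.

```python
import math


def hertz_to_bark(hz):
    """Praat's conversion: 7 * ln(hz/650 + sqrt(1 + (hz/650)^2))."""
    return 7.0 * math.asinh(hz / 650.0)


def bark_to_hertz(bark):
    """Inverse of hertz_to_bark."""
    return 650.0 * math.sinh(bark / 7.0)


# Values copied from Table 5.1 (filters 1 to 18).
TABLE_5_1 = [93, 188, 287, 392, 505, 628, 764, 915, 1086,
             1278, 1497, 1746, 2031, 2357, 2732, 3163, 3657, 4228]

mid_frequencies = [bark_to_hertz(b) for b in range(1, 19)]
largest_deviation = max(abs(m - t) for m, t in zip(mid_frequencies, TABLE_5_1))
```

Under this formula every tabulated value is reproduced to within 1 Hz.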


Measurements were made at nine points in time within every vowel segment,
starting at 25% of the total vowel duration and ending at 75% of the vowel duration,
that is, at $\frac{4}{16}$, $\frac{5}{16}$, $\frac{6}{16}$, $\frac{7}{16}$, $\frac{8}{16}$ (= center),
$\frac{9}{16}$, $\frac{10}{16}$, $\frac{11}{16}$ and $\frac{12}{16}$.

The Bark-filtered spectra were level-normalized. Normalization was done for
every 13 ms sample so that the levels add up to 80 dB.
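
The two steps can be sketched as follows. The nine sampling points follow directly from the description above. For the level normalization, the sketch assumes one possible reading of "add up to 80 dB": all band levels of a sample are shifted by the same constant so that their summed power corresponds to 80 dB. This interpretation, like the function names, is an assumption of the example and not a description of the procedure actually implemented in Praat.

```python
import math


def sampling_times(start, end):
    """Nine measurement points from 4/16 to 12/16 of the vowel duration."""
    duration = end - start
    return [start + duration * k / 16 for k in range(4, 13)]


def normalize_levels(levels_db, target_db=80.0):
    """Shift all band levels equally so that their power sum equals target_db.

    Assumption: 'levels add up to 80 dB' is read as a sum in the power domain.
    """
    total_db = 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))
    shift = target_db - total_db
    return [level + shift for level in levels_db]
```

Because every band is shifted by the same amount, the shape of the spectrum (the level differences between bands) is left untouched; only the overall level is equalized across samples.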


5.1.2 Principal component analysis 


The filter bank representation of vowels described in the previous section can be
reduced to articulatorily meaningful components by means of principal component
analysis (PCA) (Jacobi, 2009, 42). A PCA of the Bark-filtered vowel spectra was
carried out with the statistical software package SPSS³.

PCA is a data reduction technique that aims at reducing a larger number of 
variables into a smaller set of components. It enables the researcher to identify 
which variables in a data set show similar patterns of variation and whether the 
variables can be divided into relatively independent subsets. Based on a variance- 
covariance matrix or a standardized correlation matrix of the observed variables, 
variables that correlate with each other are combined into components, so that the 
total amount of data can be reduced. The first principal component (PC) explains 
as much as possible of the total variance in the data set, the second PC as much as 
possible of the variance still left, etc. 

The analysis produces a set of loadings and a set of scores for each extracted 
component. Loadings can be interpreted as correlations between the original vari- 
ables and the components and can be used to calculate scores for each object based 
on the original variables. The scores can be interpreted as such, or they can be used 
as input to further analyses replacing the larger number of original variables for each 
object and thus reducing the data set. Thorough descriptions of PCA can be found 
in statistical handbooks, for example Field (2005) or Tabachnik & Fidell (2007). 
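
As an illustration of the general idea, the following sketch extracts components from a variance-covariance matrix and projects objects onto them. It is not the SPSS procedure used for this thesis: the eigenvectors are found by power iteration with deflation purely to keep the example within the Python standard library, the returned vectors are unit-length weights rather than loadings scaled as correlations, and all function names are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Column means and variance-covariance matrix of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    return means, cov


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    p = len(matrix)
    v = [1.0 / (i + 1) for i in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Return column means, [(eigenvalue, weights), ...] and the total variance."""
    means, cov = covariance_matrix(rows)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    work = [row[:] for row in cov]
    components = []
    for _ in range(n_components):
        eigenvalue, vec = leading_eigenpair(work)
        components.append((eigenvalue, vec))
        for i in range(p):  # deflation: remove the variance already explained
            for j in range(p):
                work[i][j] -= eigenvalue * vec[i] * vec[j]
    return means, components, total_variance


def scores(rows, means, components):
    """Project (possibly new) objects onto previously extracted components."""
    return [[sum((r[j] - means[j]) * vec[j] for j in range(len(means)))
             for _, vec in components] for r in rows]
```

The separation of `pca` and `scores` mirrors the distinction between computing components on one set of objects and calculating scores for another: `eigenvalue / total_variance` gives the share of variance explained by a component, and `scores` can be applied to objects that were not part of the rows passed to `pca`.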


5.1.3 Computing loadings based on point vowels 


The PCA of the acoustic data in this thesis was based on a variance-covariance
matrix. Following Jacobi (2009) the loadings of the PCA were calculated using only
point vowels (that is, vowels with the most extreme values for the two first formants).
This is done in order to weigh all articulatory dimensions equally and to make sure
that the PCA accounts for all possible variation in the vowel space. Table 5.2 shows
a sample of the data used as input in this initial phase of the PCA. The point vowels
of the speakers serve as objects in the analysis, and the intensities of the Bark filters
as variables.

³ SPSS version 16.0 for Windows. SPSS Inc.

Table 5.2. Sample of the four point vowels (measured at the temporal mid-point) of
a number of speakers used in the analysis phase of the PCA for computing loadings.
bf = Bark filter

objects                   variables
speaker      vowel        bf₁ (dB)    bf₂ (dB)    ...    bfₙ (dB)
speaker₁     a            58.60       61.10       ...    46.37
             æ            65.57       65.25       ...    64.85
             i            67.80       72.10       ...    68.49
             u            63.09       67.44       ...    56.73
speaker₂     a            61.71       62.41       ...    41.85
             æ            71.94       73.89       ...    58.98
             i            69.23       72.27       ...    56.01
             u            67.96       70.28       ...    37.85
speakerₙ     a            59.97       61.52       ...    52.51
             æ            54.37       53.34       ...    57.02
             i            65.15       68.73       ...    59.00
             u            65.16       66.34       ...    38.82

Table 5.3. Sample of data to be reduced by the PCA. Each of the 19 vowels of
every speaker is represented by the intensities (in dB) in a number of Bark filters
(bf) measured at nine sampling points within the vowel segments.

objects                   variables
speaker      vowel,       bf₁ (dB)    bf₂ (dB)    ...    bfₙ (dB)
             point
speaker₁     dis₁         66.69       71.96       ...    64.31
             dis₂         66.79       71.82       ...    63.48
             dis₉         72.62       75.57       ...    68.85
             disk₁        73.24       75.72       ...    64.99
             disk₉        72.87       75.17       ...    65.57
             typ₁         67.26       73.04       ...    63.85
             typ₉         74.03       75.45       ...    62.56
speakerₙ     typ₁         72.71       75.28       ...    52.36
             typ...
             typ₉         75.51       75.83       ...    50.67

For the Swedish data the vowels transcribed as [i:], [æ:], [ɑ:]/[a:] and [u:] in the
database were chosen for computing the loadings in the initial phase of the PCA.
Two variants were allowed for /ɑ:/, since the pronunciation in Standard Swedish
is [ɑ:], but we did not want to exclude even more open vowels from
the analysis. Standard Swedish [æ:] has a more extreme F1 than Swedish [ɑ:] (see
Table 5.11, p. 83) and was therefore also used as a point vowel.

Some dialects show such strong deviation from Standard Swedish that not all
point vowels could be found in the set of 19 words. For example, the South Swedish
dialects have strongly diphthongized pronunciations of the long close vowels /i:/
and /u:/. Because point vowels were not available for all of the speakers, a subset
of speakers was used for calculating the loadings. All point vowels were available
for 230 women. The number of men with all point vowels was a bit larger, so
out of these speakers 230 men were picked randomly, in order to include as many
men as women in the analysis. Using 230 men and 230 women for computing the
loadings of the PCA means that a great number of vowel spaces with different
speaker-dependent sizes are included. Based on this, scores can also be calculated for the
vowels of the speakers that were not included in the initial subset. Only the central
measurement point was used for calculating the loadings. For each of the 230 men
and 230 women average levels of the Bark filters of all occurrences of the point vowels
were calculated.⁴
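
Levels expressed in decibels are normally averaged in the power domain rather than by taking the arithmetic mean of the dB values. Assuming that this is the kind of averaging meant here, the computation can be sketched as below; the function name is chosen for the example only.

```python
import math


def average_level_db(levels_db):
    """Average of decibel levels, computed in the power domain."""
    powers = [10.0 ** (level / 10.0) for level in levels_db]
    return 10.0 * math.log10(sum(powers) / len(powers))
```

For two occurrences with 70 dB and 50 dB in a given filter this yields roughly 67 dB, not the arithmetic mean of 60 dB, because the louder occurrence dominates the summed power.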

Figure 5.1 shows the mean intensities per Bark filter for the point vowels of the
230 men and 230 women. In these figures the characteristic spectra of the point
vowels can be identified. The [i:] vowel has a very low first formant resulting in an
intensity peak in the lowest frequency area, approximately at Bark filters 2 and 3.
The second formant of [i:] is very high, at approximately 13-15 Bark. The open
vowel [æ:] has a high F1 at 6-7 Bark. The spectral peaks resulting from F2 and F3
of [æ:] can also be seen clearly at 11-12 and 14-15 Bark, respectively. F1 and F2 of the
vowel [ɑ:]/[a:] are very close to each other giving a broad peak at 5-9 Bark, while
the first two formants of [u:] are very close to each other as well, but at a much
lower frequency (3-7 Bark). The lines of men and women are very similar in these
graphs, but the peaks of the spectra are consistently at a somewhat higher frequency
for women than for men. This can be illustrated by shifting the frequency scale by
1 Bark, which is done in Figure 5.2. In these figures the lines of men and women
follow each other almost perfectly.

The biggest difference between the spectra of men and women in Figure 5.2
seems to be that the women in general have less intensity at the highest frequencies,
which can be most clearly seen in the case of [i:]. The vocal folds of men and women
produce a different kind of pulse. The open portion of a fundamental period is
relatively longer in women’s voices than in men’s, and as a result the higher
harmonics are weaker in women’s speech (Rietveld & Van Heuven, 2009, 341). This
leads to a steeper spectral tilt for women than for men. Measured with four rather
broad frequency bands, Sluijter & Van Heuven (1996) found that at 0-500 Hz female
voices have 1 dB greater intensity than male voices, while at 500-4000 Hz the
intensity is 2-3 dB weaker in female voices than in male voices. The greater intensity
of men than of women at the highest frequencies in Figure 5.2 is therefore attested
also in previous studies.

⁴ $L_{\text{average}} = 10 \log_{10}\left[\frac{1}{n} \sum_{i=1}^{n} 10^{L_i/10}\right]$

Figure 5.1. Average decibel levels of the point vowels in the band-pass filtered
frequency regions.

Van Nierop, Pols, & Plomp (1973) showed that, because the differences between the
spectra of vowels produced by men and women are systematic, separate PCAs for
the two groups lead to similar results. The main latent variables related to vowel
pronunciation are present in the acoustic data of both men and women, but the
information is found at a higher frequency in the female data. This fact was used
in the current thesis for normalizing for the systematic differences in the acoustic
data of men and women related to anatomy/physiology: separate PCAs were carried
out on the male data and the female data. The scores of the two separate PCAs
can subsequently be used as comparable measures of vowel quality and can be used
in analyses of dialectal variation. Bark filters 2-17 were used as variables in the
PCA of the male data and Bark filters 3-18 in the female analysis. Using a set
of Bark filters shifted by 1 Bark as variables in the two analyses means that the
frequency area analyzed in the two PCAs contains the same information related to
vowel pronunciation. Both analyses comprised 920 objects in the initial phase (4
vowels × 230 speakers). In § 5.1.5 the effect of analyzing the female and male data
separately is examined more closely.

Figure 5.2. Average decibel levels of the point vowels in the band-pass filtered
frequency regions. Scales of men and women shifted by 1 Bark.

Jacobi (2009) carried out one PCA, which included both female and male speak- 
ers. She combined the two lowest Bark filters (1 and 2) in order to “prevent strong 
variance caused by the speakers’ varying fundamental frequency”. The mean funda- 
mental frequency of Swedish speakers has been shown to be 188 Hz for women and 
116 Hz for men (Pegoraro-Krook, 1988). Thus, the fundamental frequency is mainly 
represented in the first Bark filter for men and in the second for women. These Bark 
filters are not important for identifying vowels. When analyzing data from men and 
women separately the lowest Bark filters which represent the fundamental frequency 
can be left out. 
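
The placement of the fundamental frequency can be checked with a small worked example. The sketch assumes that the filters are centered on integer Bark values (as the mid-frequencies in Table 5.1 suggest) and uses the Bark conversion formula quoted for Praat; the helper names and the dictionary layout are hypothetical.

```python
import math


def hertz_to_bark(hz):
    """Praat's Hertz-to-Bark conversion."""
    return 7.0 * math.asinh(hz / 650.0)


def nearest_filter(hz):
    """Number of the Bark filter whose center lies closest to the given frequency."""
    return round(hertz_to_bark(hz))


# Filter sets used as variables in the two separate analyses.
FILTERS = {"men": list(range(2, 18)), "women": list(range(3, 19))}


def select_filters(levels_by_filter, sex):
    """Pick the levels of the filters analyzed for the given sex.

    levels_by_filter: dict mapping filter number (1..18) to a level in dB.
    """
    return [levels_by_filter[k] for k in FILTERS[sex]]


f0_filter_men = nearest_filter(116)    # mean F0 of men (Pegoraro-Krook, 1988)
f0_filter_women = nearest_filter(188)  # mean F0 of women
```

With these values, 116 Hz corresponds to about 1.2 Bark and 188 Hz to about 2.0 Bark, so the mean fundamental frequency falls into filter 1 for men and filter 2 for women, and in both cases outside the 16 filters that enter the respective analysis.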

After computing the loadings of a PCA with a subset of objects, scores can be
computed for the full data set. The second phase of the PCA was also carried out
separately for men and women. Table 5.3 shows a sample of the full data set used
as input in the second phase of the PCA. The objects of the full data set are the 19
vowels (see § 4.2.1) of all (male/female) speakers measured at nine sampling points.
Based on the loadings from the initial phase of each PCA scores for all vowels of
all male/female speakers were computed. Table 5.4 displays a sample of the scores
which are the output of the PCA. In the data set that has been reduced by means of
PCA each object is described by scores on two extracted components (PCs) instead
of by the original variables. These scores are the measures of vowel quality.

Table 5.4. Result of data reduction by PCA. The original variables (that is, the
intensities in a number of Bark filters) have been reduced to scores on two PCs.
The objects of the analysis are the 19 vowels of all speakers measured at nine
sampling points.

speaker      vowel,       PC1      PC2
             point
speaker₁     dis₁         0.99     1.75
             dis₂         1.20     1.61
             dis...       ...      ...
             dis₉        -2.15     1.62
             disk₁        1.56     1.76
             disk...      ...      ...
             disk₉       -1.83     1.58
             typ₁        -1.02     1.79
             typ...       ...      ...
             typ₉        -1.82     1.70
speakerₙ     typ₁        -1.02     0.98
             typ...       ...      ...
             typ₉        -1.31     0.96


5.1.4 Rotating the solution 


Because the first PC explains as much as possible of the total variance in the data set, 
most variables will have relatively high loadings on the first PC and smaller loadings 
on the remaining components. This can make interpretation of components difficult, 
since the original variables are not necessarily unambiguously connected to only one 
of the extracted components. This characteristic can be changed by using rotation 
techniques. 

The most commonly used rotation technique is varimax, which rotates the axes 
so that the variables correlate maximally with only one component. Figure 5.3 
shows the loadings of the two first PCs based on the male and female point vowels 
in unrotated solutions. The plots show clouds of variables that are not centered 
along any of the axes, but the variables correlate with both the first and the second 
PC. These are typical cases where rotation could lead to more easily interpretable 
solutions. 

Figure 5.4 shows the loadings of PCAs of the same data with varimax rotation.
In contrast to Figure 5.3, the clouds of variables are now centered along the x- and
y-axes and only a few variables correlate highly with both PCs. Interpretation of
the components becomes easier because it is evident which variables correlate with
each of the PCs.⁵

Figure 5.3. Loadings on the two first PCs (men to the left, women to the right).
The numbers indicate the number of the Bark filters (men 2-17 Bark, women 3-18
Bark).

Figure 5.4. Loadings on two PCs extracted with varimax rotation (men to the
left, women to the right). The numbers indicate the number of the Bark filters (men
2-17 Bark, women 3-18 Bark).

Figure 5.5. Loadings of the male and female PCAs (2 PCs extracted without
rotation) plotted against the frequency scale.

Varimax also tends to equalize the proportion of variance explained by the com- 
ponents, by taking variance from the first component and distributing it among the 
later ones (Tabachnik & Fidell, 2007, 638). Because of this all components will be 
affected by the number of components extracted when using varimax. 

One further difference between the two pairs of plots in Figures 5.3 and 5.4 is that 
the configurations of the male and female analyses are mirrored around the x-axis in 
the unrotated solutions in Figure 5.3; the highest band-pass filters are found in the 
fourth quadrant in the male solution but in the first quadrant in the female solution, 
while the middle frequencies (Bark filters 6-10) are in the first quadrant in the male 
solution and in the fourth in the female solution. After applying varimax rotation 
the solutions are much more similar for men and women (Figure 5.4). Van Nierop 
et al. (1973) found that solutions based on male data and female data are comparable 
and only need to be rotated in order to overlap each other. Varimax seems to offer 
a standard solution for rotation so that the configurations of male and female data 
have the same orientation. Therefore, varimax was chosen as rotation technique for 
extracting the PCs for this thesis. 
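
For two components a varimax rotation reduces to finding a single rotation angle, which makes the principle easy to illustrate. The sketch below searches a grid of angles for the one that maximizes the raw varimax criterion (the sum over components of the variance of the squared loadings). It is a brute-force illustration with hypothetical function names, not the algorithm implemented in SPSS, and it omits the Kaiser normalization that SPSS applies by default.

```python
import math


def varimax_criterion(loadings):
    """Sum over components of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        total += sum(s * s for s in squares) / p - (sum(squares) / p) ** 2
    return total


def rotate(loadings, angle):
    """Orthogonal rotation of a two-column loading matrix by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]


def varimax_2d(loadings, steps=9000):
    """Grid search over [0, pi/2) for the angle with the highest criterion."""
    best = max((k * (math.pi / 2) / steps for k in range(steps)),
               key=lambda angle: varimax_criterion(rotate(loadings, angle)))
    return best, rotate(loadings, best)
```

The search can be restricted to a quarter turn because rotating by 90 degrees only swaps the two components and flips a sign, which leaves the criterion unchanged. After rotation each variable should load mainly on one of the two components, which is the pattern visible when the clouds of variables line up with the axes.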

The difference between non-rotated solutions and solutions using varimax can be
seen even more clearly when plotting the loadings along the frequency scale.
Figure 5.5 shows the loadings of the two first PCs of the male and female data extracted
without rotation, while Figure 5.6 shows the loadings of the varimax solutions. While
Figure 5.5 displays quite different curves for men and women, the curves of the
components in Figure 5.6 look very similar for both sexes. In the varimax solutions
the loadings of men and women only seem to be placed differently on the frequency
scale.

⁵ For an interpretation of the PCs, see § 5.2.3. See also § 5.2.1, where both the unrotated and
the rotated solutions are compared to formant measurements.

Figure 5.6. Loadings of the male and female PCAs (2 PCs extracted with varimax
rotation) plotted against the frequency scale.

Figure 5.7. Loadings of the male and female PCAs (2 PCs extracted with varimax
rotation) plotted against a shifted frequency scale.

Figure 5.8. Scores of the point vowels [i:], [æ:], [ɑ:]/[a:] and [u:] in the PC2/PC1
plane with varimax rotation. The plots of the male data (left) and female data
(right) are similar to each other and resemble the IPA vowel quadrilateral. One
standard deviation ellipses.

In Figure 5.7 the frequency scale has been shifted so that instead of using the 
same scale for men and women, the loadings of the male analysis are plotted on a 
scale ranging from 2 to 17 Bark, while the loadings of the female analysis are plotted 
on a scale from 3 to 18 Bark (which corresponds to the contiguous Bark filters used 
as variables in the two analyses). This figure shows that the curves are indeed almost 
identical for men and women after applying varimax, which suggests that the same 
information is extracted in both PCAs. Because of the anatomical/physiological 
differences between men and women this information can be found on average 1 Bark 
higher in the female data. 

The effect of applying varimax can also be visualized by plotting the scores
assigned to the point vowels by the PCA. The scores are the result of the data
reduction, and can be used as measures of vowel quality for each segment. The scores
were estimated with the regression method, which produces scores with a mean of 0
and a standard deviation of 1 for each PC (Tabachnik & Fidell, 2007, 650). Figure 5.8
shows the scores of the male and female point vowels in the varimax solution.⁶ Just
as in a formant plot (see, for example, Figure 2.4, p. 25, and Figure 2.5, p. 30), the
scores are plotted with the first component on the y-axis and the second component
on the x-axis with both scales reversed, which results in configurations that resemble
the IPA vowel quadrilateral. Figure 5.9, on the other hand, displays the scores of
the point vowels of the unrotated PCAs. These plots do not show the familiar vowel
quadrilateral with backness along the x-axis and height along the y-axis, but the
vowel spaces have a rotated position in the coordinate system. Moreover, the plots
of the male and female point vowels are mirrored around the y-axis.

⁶ Ellipses are drawn by applying PCA once more, but this time separately for each vowel with
the acoustic PCs as variables. The major and minor axes of the ellipses are the two first PCs
of the data and the longest axis, hence, shows the direction that explains most of the variance
(Harrington, 2010, Ch. 6). Assuming that the data is normally distributed an ellipse with a radius
of 1 standard deviation covers 39.4% of the data points, while an ellipse with a radius of 2 standard
deviations covers 86.5% of the data points. All ellipse plots in this thesis were drawn using the
eplot() function of the Emu library in the software package R.

Figure 5.9. Scores of the point vowels [i:], [æ:], [ɑ:]/[a:] and [u:] in the PC2/PC1
plane without applying any rotation technique. The vowel plots of the male data
(left) and female data (right) show a skewed position compared to the plots in
Figure 5.8 and are mirrored around the y-axis. One standard deviation ellipses.


5.1.5 The result of separate PCAs of men and women


The result of analyzing the vowel pronunciations by men and women separately can
be examined by comparing the vowel scores. The plot to the left in Figure 5.10 shows
the scores of the point vowels after running separate PCAs for men and women with
varimax rotation. The ellipses, which have a radius of one standard deviation, fit
each other almost perfectly.⁶

Figure 5.10. Scores of the point vowels [i:], [æ:], [ɑ:]/[a:] and [u:] (2 PCs extracted
with varimax rotation). The plot to the left is based on separate PCAs for men and
women, while the plot to the right is based on one PCA for speakers of both sexes.
Separate one standard deviation ellipses are drawn for men (m) and women (w).
The plot to the right fails to normalize vowel quality with respect to sex.

For comparison the plot to the right in Figure 5.10 shows the vowel scores of a
PCA where men and women were included in the same analysis. Bark filters 2-18
were analyzed for all speakers, and the analysis included 1,840 objects (4 vowels ×
460 speakers). Because the sexes were analyzed together the anatomical differences
between men and women led to systematic differences in the scores of the PCA.
Vowels produced by women were assigned systematically higher scores on PC1 than
vowels produced by men. Separate analyses for men and women seem to normalize
for the anatomical differences.
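
When reading the ellipse plots it helps to know how much of the data a standard deviation ellipse is expected to enclose. For bivariate normally distributed data the expected share inside an ellipse with a radius of r standard deviations is 1 − exp(−r²/2). The short sketch below evaluates this expression; the function name is chosen for the example only.

```python
import math


def ellipse_coverage(radius_sd):
    """Expected share of bivariate normal data inside an ellipse whose
    radius is given in standard deviations."""
    return 1.0 - math.exp(-radius_sd ** 2 / 2.0)


one_sd = ellipse_coverage(1.0)
two_sd = ellipse_coverage(2.0)
```

This gives about 39% for one standard deviation ellipses and about 86.5% for two standard deviation ellipses, so most individual data points lie outside a one standard deviation ellipse even when the groups match well.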

In order to test to what extent separate PCAs for men and women actually
normalize for the anatomical differences, t-tests were carried out for all four point
vowels. This was done both for the scores of the separate PCAs of men and women
and for the PCA that included both men and women. The t-values and the
significance levels of the t-tests are displayed in Table 5.5. The results show that when
separate PCAs are carried out for men and women, there are no significant differences
between the scores of men and women except on PC1 of [u:]. When only one PCA
is run with both men and women included, all vowels show significant
differences on PC1, and [i:] and [u:] also on PC2. The separate analyses of men and
women clearly lead to fewer differences between the sexes in the vowel scores.
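The degrees of freedom reported for these tests (df = 458) match a two-sample t-test with pooled variance on 230 women and 230 men (230 + 230 − 2). The following is a minimal sketch of such a test, assuming equal variances in both groups. The scores are randomly generated placeholders, not thesis data, and the sign of t depends only on the order in which the groups are passed.

```python
import math
import random

def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean_a - mean_b) / se, df

# Hypothetical PC1 scores of one point vowel (placeholder data).
rng = random.Random(1)
women = [rng.gauss(0.0, 1.0) for _ in range(230)]
men = [rng.gauss(0.0, 1.0) for _ in range(230)]

t, df = pooled_t(women, men)
print(f"t({df}) = {t:.2f}")
```

With scores standardized separately per sex, both group means are close to zero, so t stays small. A systematic offset between the groups, as in the joint PCA, would inflate it.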

Given that the separate analyses of vowels produced by men and women show
fewer significant differences between the sexes than one single analysis, we expect that
the total variance caused by speaker-specific variation is also reduced by the separate
analyses. A visual impression is given by the plots in Figure 5.11, which show two
standard deviation ellipses of the point vowels of the 460 speakers (230 men and 230
women) in the separate analyses and in the joined analysis. The size of the ellipses
suggests that a large amount of individual variation is still left after both analyses.
However, the variation is reduced to some extent by analyzing men and women
separately, as can most clearly be seen in the cases of [i:] and [u:]; the ellipses of
these two vowels show overlap when men and women were analyzed together (right
plot in Figure 5.11) but no overlap when the PCA was carried out separately for the
two speaker groups (left plot in Figure 5.11).


Figure 5.11. Two standard deviation ellipses of the scores of the point vowels [i:],
[æ:], [a:]/[a:] and [u:] (2 components extracted with varimax rotation). In the graph
to the left, based on separate PCAs for men and women, the ellipses are smaller and
show less overlap than in the graph to the right based on one PCA for speakers of
both sexes. Processing male and female voices separately, hence, reduces variability.


Table 5.5. T-tests comparing the means of female and male speakers on each point
vowel and on both PCs (df = 458).

                      separate PCAs for       one PCA for both
                      men and women           men and women
vowel        value    PC1        PC2          PC1        PC2
[i:]         t        −1.3       −0.4         −10.7      8.3
             sign.    0.191      0.671        <0.001     <0.001
[æ:]         t        −1.1       0.5          −9.0       −0.6
             sign.    0.254      0.608        <0.001     0.545
[a:]/[a:]    t        0.1        −0.9         −17.7      −1.0
             sign.    0.895      0.380        <0.001     0.318
[u:]         t        2.1        0.8          −9.7       3.3
             sign.    0.033      0.435        <0.001     0.001

The analyses above support the choice of carrying out separate PCAs for men 
and women when reducing the filter bank representation of vowels to articulatory 
meaningful components. In order to further investigate which sources account for 
the variance in the PCs multivariate analyses were carried out. The multivariate 
analyses of PCs as well as of formant frequencies are presented in § 5.2.2. 


5.1.6 Effect of noise 


Jacobi (2009, 59-63) found that the positioning of the vowel spaces of different 
speakers in the PC2/PC1 plane was affected by the signal-to-noise ratios in the 
recordings: the lower the signal-to-noise ratio (that is, the more noise), the smaller 
the measured vowel space in the PC plane and the higher the PC scores. Jacobi’s 
data comprised recordings in extremely varying situations from interviews in silent 
environments to private conversations and broadcast recordings with music in the 
background. 

The present data set is much more homogeneous when it comes to the recording
situations than Jacobi’s data. In the SweDia project, care was taken that all
recordings were made in circumstances as similar as possible. The recordings were
generally made in the speakers’ homes, and quiet rooms with as little reverberation
as possible were chosen to ensure good recording quality (see § 4.1). Because no
drastically varying signal-to-noise ratios were expected in the present data, the effect
of noise in the dialect recordings was not tested.

The speakers representing Standard Swedish, however, were not recorded in their
homes, but in a studio. A studio is quieter than any home environment, which
means that a higher signal-to-noise ratio could be expected for the speakers of
Standard Swedish than for the dialect speakers. The speakers of Standard Swedish were
not included in the initial subset used for calculating the loadings of the PCAs, but
only included in the full data set for which scores were calculated. As mentioned
earlier, the regression technique used for calculating the scores in the PCA produces
scores with a mean of 0 and standard deviation of 1 for each PC. To test whether the
standard speakers had systematically lower scores than the dialect speakers, whose
recordings presumably have a lower signal-to-noise ratio, t-tests were carried out.
The average scores on both PCs of the standard speakers’ four point vowels
([i:], [æ:], [a:]/[a:] and [u:]) were tested against the point vowels of the 460
dialect speakers of the initial subset.
For the dialect speakers the mean of both PCs was < 0.001 as expected. For the
speakers of Standard Swedish the mean of PC1 was −0.339 and the mean of PC2
−0.273. The difference between dialect speakers and standard speakers was
significant for PC1 and fell just short of the 0.05 level for PC2
(PC1: t(1886) = −2.30, p = 0.021; PC2: t(1886) = −1.86, p = 0.063).
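The two reported p-values can be checked from the t-values alone. With 1886 degrees of freedom the t distribution is practically indistinguishable from the standard normal, so the sketch below uses the normal distribution as an approximation (an assumption of this sketch, not the procedure used in the thesis) and computes two-sided p-values.

```python
import math

def two_sided_p_normal(t):
    """Two-sided p-value from the standard normal, a large-df stand-in for t."""
    return math.erfc(abs(t) / math.sqrt(2))

# t-values as reported in the text above.
for label, t in (("PC1", -2.30), ("PC2", -1.86)):
    print(label, round(two_sided_p_normal(t), 3))
```

The approximation reproduces both reported values to three decimals. The value for PC2 lies slightly above the conventional 0.05 threshold in a two-sided test.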

As found by Jacobi (2009) a lower signal-to-noise ratio seems to lead to higher 
PC scores. Since the PCs are not comparable across the two different recording 
environments, the speakers of Standard Swedish were not included in any of the 
statistical analyses in Chapters 6 and 7. The Standard Swedish pronunciation is 
only included in the maps of each vowel in Appendix C. 


5.1.7 Summary of the acoustic analysis


In Figure 5.12 all the steps of the acoustic analysis of the vowel data are summarized.
The acoustic analysis includes two steps which have been described above: 1) Bark
filtering, and 2) data reduction of the Bark-filtered data with PCA. In a final step,
before dialectal variation in the vowels is analyzed, averages are calculated for
groups of speakers.

The Bark filtering was done at nine sampling points in each vowel segment and 
the Bark-filtered spectra were level-normalized to 80 dB. Since all speakers had 
repeated each of the 19 elicited vowels 3-5 times during the interviews, average 
dB levels of the Bark filters were calculated per vowel per sampling point for each 
speaker. After the Bark filtering and the averaging, each of the 19 vowels of each 
speaker is represented by the average level-normalized intensities in the Bark filters 
at 9 sampling points. 

In the data reduction phase the data was split up according to speaker-sex. 
Separate PCAs were carried out for women and men with Bark filters 2-17 used as 
variables in the male analysis and Bark filters 3-18 in the female analysis. The PCA 
includes computing loadings based on point vowels, and subsequently computing 
scores for all of the 19 vowels. After the PCA each of the 19 vowels of each speaker 
is represented by two PCs at the 9 sampling points instead of by the dB levels in the 
frequency bands. The PCs of men and women are the output of two separate PCAs, 
but the scores are comparable because they represent the same latent variables in 
the vowel spectra. 

As explained in § 4.3 three different groupings of the speakers in the data set are 
used in the analysis of dialectal variation in this thesis: a) one group per site, b) 
older and younger speakers per site, and c) older women, older men, younger women, 
and younger men per site. For the analyses in the following chapters of this thesis, 
arithmetic group means were calculated for PC1 and PC2 respectively for each of 
the 19 vowels at the nine sampling points. After this averaging the pronunciation of 
the 19 vowels of each speaker group is represented by two PCs at 9 sampling points. 
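The averaging step can be sketched as follows. This is a minimal illustration assuming one record per speaker, vowel and sampling point. The field names and the three example records are hypothetical and do not reflect how the thesis data were stored.

```python
from collections import defaultdict

def group_means(records):
    """Arithmetic mean of PC1 and PC2 per (site, group, vowel, sampling point)."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for rec in records:
        key = (rec["site"], rec["group"], rec["vowel"], rec["point"])
        acc = sums[key]
        acc[0] += rec["pc1"]
        acc[1] += rec["pc2"]
        acc[2] += 1
    return {key: (s1 / n, s2 / n) for key, (s1, s2, n) in sums.items()}

# Hypothetical scores of three older speakers from one site (placeholder data).
records = [
    {"site": "Skog", "group": "older", "vowel": "dis", "point": 5, "pc1": -1.2, "pc2": 0.9},
    {"site": "Skog", "group": "older", "vowel": "dis", "point": 5, "pc1": -1.0, "pc2": 1.1},
    {"site": "Skog", "group": "older", "vowel": "dis", "point": 5, "pc1": -0.8, "pc2": 1.0},
]
means = group_means(records)
print(means[("Skog", "older", "dis", 5)])
```

The same function covers all three groupings mentioned above: only the value stored under "group" changes (for example a single label per site, an age label, or a combined age and sex label).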


5.2 Principal components versus formants 


Since formant measurements have a much longer tradition in variationist linguistics 
than the use of PCs of Bark-filtered spectra, a comparison of the PCs was made 
with formants. This comparison was also used for finding the optimal number of 
PCs to extract. 

Formants were measured for a subset of the SweDia data. Three sites were
picked randomly: Ankarsrum, Malung and Skog. Formants were measured for all
speakers from these sites. The data sets from Ankarsrum and Skog were complete
with three speakers in each speaker group (older women, older men, younger women
and younger men). From Malung, however, one young man was missing in the data
set, which means that formants were measured for 18 women and 17 men in total.
The formants were measured for the vowels in the stressed syllables of the words
described in § 4.2.1: dis /i:/, disk /ɪ/, dör /œ:/, dörr /œ/, flytta /ʏ/, lass /a/, lat
/ɑ:/, leta /e:/, lett /e/, lott /ɔ/, lus /ʉ:/, lås/låt /o:/, lär /æ:/, lös /ø:/, nät /ɛ:/,
sot /u:/, särk /æ/, söt /ø:/, typ /y:/. In addition, the vowel /ɵ/ elicited with the
word ludd was used. In Ankarsrum låt was used to elicit /o:/, while lås was used
in Malung and Skog. One instance of every elicited vowel was measured for each of
the speakers, which led to a total of 360 vowel pronunciations by women and 336 by
men.⁷


Figure 5.12. Work flow of the acoustic analysis.

The first three formants were measured in the center of each segment with the 
formant track function in the EMU⁸ software. The window length was 25 ms and
the window type Blackman. The nominal F1 value was set to 500 Hz for the male 
speakers and to 630 Hz for the female speakers. All measurements were inspected 
and errors made by the formant tracker were corrected manually. 
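The formant values in the following sections are given in Bark. The conversion from Hertz is not restated in this section. As a minimal sketch, the block below assumes Traunmüller's (1990) approximation of the Bark scale and omits its corrections for very low and very high values. Whether this exact formula was used for the thesis data is an assumption.

```python
def hz_to_bark(f_hz):
    """Traunmüller's (1990) Hz-to-Bark approximation (assumed here)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# Example frequencies: the two nominal F1 values and two typical F2 values.
for f in (500, 630, 1500, 2000):
    print(f, round(hz_to_bark(f), 2))
```

Under this formula 2000 Hz corresponds to about 13 Bark and 1500 Hz to about 11 Bark.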


5.2.1 Correlation with formants 


Jacobi (2009, Ch. 3) found high correlations between PCs of Bark-filtered vowel 
spectra and formant measurements. The correlations are displayed in Table 5.6. 
The study was based on pronunciations of the Dutch phonemes /a/, /i/, /u/, /e/ 
and /ei/ by six female and six male speakers, measured close to onset and close 
to offset (2,767 speech segments in total). In a second study (Jacobi, 2009, Ch. 4) 
including more speakers (35 female and 35 male) and more vowel tokens (12,400) and 
a different set of vowels (Dutch /o:/, /e:/, /ei/, /au/ and /œy/) Jacobi found that
the correlations were somewhat lower: PC1-F1 r = 0.70, PC2-F2 r = 0.72. In these 
studies, the formants were measured automatically without manual correction of 
the measurements, which means that the correlations were based on partly incorrect 
formant values. With corrected formant measurements we can expect to find even 
higher correlations. Moreover, Jacobi (2009) did not use any rotation technique to 
optimize the PC solution. Rotating the solution might influence the correlations 
between formants and components. 


Table 5.6. Correlations between formants and PCs of Bark-filtered spectra found
by Jacobi (2009, Table 3.2, p. 38).

          correlations                  expl. var.
          F1        F2        F3                   cum.
PC        Bark      Bark      Bark      %          %
PC1       0.81      −0.12     0.26      65         65
PC2       −0.08     0.70      0.10      25         90
PC3       −0.19     0.05      −0.15     5          95


⁷Four male speakers were missing one of the target vowels, which explains the total of 336 vowels
instead of the expected 340 for the male data.

⁸The EMU Speech Database System. Version 2.1.1. By the Institute of Phonetics and Speech
Processing, LMU Munich. <http://emu.sourceforge.net/>

Table 5.7. Correlations between formants and PCs in the unrotated solutions.
Insignificant correlations (p > 0.05) are indicated by a hyphen.

                  correlations                  expl. var.
                  F1        F2        F3                   cum.
sex      PC       Bark      Bark      Bark      %          %
men      PC1      0.476     0.528     −0.182    42.4       42.4
         PC2      0.696     −0.644    -         35.1       77.4
         PC3      -         −0.239    0.361     5.7        83.1
women    PC1      0.712     0.369     −0.125    49.7       49.7
         PC2      −0.506    0.731     0.239     28.4       78.2
         PC3      -         −0.234    0.399     5.2        83.3


The formant measurements of the subset of the SweDia data were correlated with 
the results of a number of PCA configurations in order to find the optimal PCA 
solution. PCA with and without varimax rotation was tested, as well as solutions 
with two and three extracted components. All PCAs were carried out using only 
point vowels in the initial phase and by analyzing data of men and women separately, 
as described in § 5.1. Since the PCAs were carried out separately on the vowels 
produced by men and women, also the correlations with formants were calculated 
separately for the two groups. 

Because there were more front vowels than back vowels in the data, the distribu- 
tion of the F2 values (and related PCs) was not completely normal. The relationship 
between formants and related PCs was still linear and the correlations were calcu- 
lated using Pearson’s correlation coefficient. All the correlations were also tested 
using the non-parametric Spearman’s rho, which did not lead to other conclusions 
about the relationship between formants and PCs. 
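The two coefficients differ only in what they are computed on: Pearson's r uses the raw values, Spearman's rho uses their ranks. A minimal sketch follows, with hypothetical F2 and PC2 values that imitate a sample dominated by front vowels (placeholder data, not measurements from the thesis).

```python
import math

def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical values: F2 in Bark and PC2 scores, skewed towards front vowels.
f2 = [6.0, 7.5, 9.0, 11.0, 12.5, 13.0, 13.2, 13.5]
pc2 = [-1.6, -1.1, -0.5, 0.2, 0.7, 0.8, 1.0, 0.9]
print(round(pearson(f2, pc2), 3), round(spearman(f2, pc2), 3))
```

When both coefficients point in the same direction with similar strength, as reported above, the skewed distribution is unlikely to drive the result.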

Table 5.7 shows the correlations of the three first PCs in the unrotated PC 
solutions with the three first formants. In both the female and the male analysis 
the two first PCs correlate highly with the two first formants. The correlation with 
F3 is smaller. It is noticeable that for both men and women PC1 does not correlate
only with F1 (men: 0.476; women: 0.712) but also with F2 (men: 0.528; women:
0.369). Similarly PC2 correlates with both F1 (men: 0.696; women: −0.506) and F2
(men: −0.644; women: 0.731). When correlating F1 and F2 with each other there
is no significant correlation in the female data set and a significant but very modest
(0.119) correlation in the male data set. The double correlations are therefore not a
by-product of a relationship between the formants themselves: both PC1 and PC2
catch variation caused by F1 and F2. This is not surprising when looking at the vowel
plots of the unrotated solutions in Figure 5.9 (p. 67). The vowel plots are skewed
in comparison to a formant plot, which explains why both PC1 and PC2 correlate
with F1 and F2.

A large difference between the female and male solution is that PC2 and F2 show
a positive correlation for the women, but a negative one for the men. This can be
compared to the mirrored scores in Figure 5.9 (p. 67) and the mirrored loadings in
Figure 5.3 (p. 63).


Table 5.8. Correlations between formants and PCs in the varimax solutions with
three components extracted. Insignificant correlations (p > 0.05) are indicated by a
hyphen.

                  correlations                  expl. var.
                  F1        F2        F3                   cum.
sex      PC       Bark      Bark      Bark      %          %
men      PC1      0.841     −0.443    -         39.0       39.0
         PC2      -         0.489     0.152     27.4       66.4
         PC3      0.181     0.522     −0.384    16.7       83.1
women    PC1      0.819     −0.442    −0.135    38.1       38.1
         PC2      -         0.504     0.345     25.3       63.4
         PC3      0.250     0.533     −0.326    19.9       83.3


Table 5.9. Correlations between formants and PCs in the varimax solutions with
two components extracted. Insignificant correlations (p > 0.05) are indicated by a
hyphen. These solutions have the strongest association between PC1 and F1 and
between PC2 and F2.

                  correlations                  expl. var.
                  F1        F2        F3                   cum.
sex      PC       Bark      Bark      Bark      %          %
men      PC1      0.880     −0.352    −0.185    41.1       41.1
         PC2      -         0.732     -         36.3       77.4
women    PC1      0.875     −0.360    −0.277    41.4       41.4
         PC2      0.152     0.744     -         36.8       78.2

When using varimax rotation, the number of components extracted influences all 
factors (see § 5.1.4). Tables 5.8 and 5.9 show the correlations of the rotated solutions 
when extracting three and two components respectively. In the solution with three 
components (Table 5.8) PC1 explains 39.0% of the variance for men and 38.1% for 
women. In the unrotated solutions in Table 5.7 the amount of explained variance 
on PC1 is considerably higher (men: 42.4%; women: 49.7%). The total amount of 
variance explained by three PCs, however, is the same (men: 83.1%; women: 83.3%). 
Because varimax maximizes the variance of the squared loadings within each extracted
PC, the explained variance is redistributed across the components and the relative
importance of the PCs is equalized.
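That rotation changes the share of each component but not the total can be illustrated for the two-component case. The sketch below is not the routine used for the thesis analyses. It finds the planar rotation by brute force over a grid of angles instead of the usual iterative algorithm, it takes the raw varimax criterion to be the summed variance of the squared loadings per component, and the loading matrix is invented for illustration.

```python
import math

def varimax_criterion(loadings):
    """Sum over components of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for k in range(len(loadings[0])):
        squared = [row[k] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total

def rotate(loadings, angle):
    """Orthogonal rotation of two-component loadings by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c + y * s, -x * s + y * c) for x, y in loadings]

def varimax_2d(loadings, steps=1800):
    """Brute-force planar varimax: try angles in [0, pi/2) and keep the best."""
    angles = (k * (math.pi / 2) / steps for k in range(steps))
    best = max(angles, key=lambda a: varimax_criterion(rotate(loadings, a)))
    return rotate(loadings, best)

def explained(loadings):
    """Sum of squared loadings per component."""
    return [sum(row[k] ** 2 for row in loadings) for k in range(2)]

# Invented unrotated loadings of five variables on two components.
unrotated = [(0.7, 0.6), (0.8, 0.5), (0.6, -0.6), (0.5, -0.7), (0.7, -0.1)]
rotated = varimax_2d(unrotated)

print("unrotated:", [round(v, 3) for v in explained(unrotated)])
print("rotated:  ", [round(v, 3) for v in explained(rotated)])
```

The angle range can be limited to a quarter turn because a rotation by 90 degrees only swaps the two components. The sum of the two printed values is the same before and after rotation, which mirrors the unchanged cumulative percentages in the tables.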

In the varimax solution with three components (Table 5.8) PC1 and F1 correlate 
more strongly with each other than in the unrotated solution (men: 0.841; women: 
0.819). On the other hand, all three extracted components correlate considerably 
with F2 and there are only modest correlations between the PCs and F3. PCA does 
not seem to be able to completely separate the first three formants. 

When extracting two components with varimax rotation the pairwise correlations
between PC1 and F1 on the one hand, and PC2 and F2 on the other are considerably 
higher than in the other solutions (see Table 5.9). These correlations are also higher 
than the ones found by Jacobi (2009) (Table 5.6). F1 and PC1 correlate at a level of 
0.88 for both men and women, and the correlation between F2 and PC2 is 0.73-0.74. 
PC1 also partly explains variation caused by F2 and F3. 

When analyzing dialectal variation it is interesting to be able to draw articulatory 
conclusions about vowel pronunciation. The varimax solutions with two extracted 
components (Table 5.9) show the strongest direct relationship between PC1 and F1 
on the one hand and PC2 and F2 on the other. As can be seen in Figure 5.8 (p. 66) 
this solution also gives scores that correspond to the traditional vowel quadrilateral 
when plotting the scores in the PC2/PC1 plane. Because this model gave the highest 
correlations with formant measurements, it was chosen to be used throughout this 
thesis for analyzing dialectal variation. 

Figure 5.13 shows scatter plots of PC1 versus F1 and PC2 versus F2 of the 
varimax solutions with two extracted components, with separate regression lines 
for men and women. The linear relationship between F1 in Bark and PC1 is very 
strong. F2 and PC2 show a somewhat less strong relationship. The cloud is denser 
for high PC2/F2 values than for low values, which reflects the fact that the data 
comprises more front vowels than back vowels. Especially for the highest F2
values, however, PC2 is in some cases lower than expected. While F1 can be well
predicted from PC1, PC2 apparently includes other spectral information than F2 only (see
further § 5.2.3). Because of the strong relationship between PC1 and F1, PC1 can
roughly be interpreted as representing vowel height. PC2 represents vowel backness 
to some extent, but the articulatory conclusions based on PC2 should not be as 
strong as those made for PC1. 

Jacobi (2009) measured high pairwise correlations between formants and PCs 
without using any rotation technique when extracting the PCs. One reason for 
this could be that the Dutch point vowels are different from the Swedish ones. 
Furthermore, the correlations of Jacobi (2009) are based on only five different vowels, 
while the current study includes 20 different vowel phonemes. This means that the 
correlations in the current study are based on a more varied and more continuous 
data set even though the total number of tokens is smaller. Because of this, the 
results of the two studies are not directly comparable to each other. 

Even though the correlations with formant measurements are high one should 
still bear in mind that using band-pass filtering means that rather broadly defined 
frequency regions determine the representation of vowel quality. Some finer differ- 
ences in formant frequencies between language varieties, possible to find by manual 
analysis and correction, may be lost. One advantage of using PCA of Bark-filtered 
spectra, however, is that the method can be completely automated, which makes it 
suitable for analyzing large data sets. 

Figure 5.13. Scatter plots of PCs (2 components extracted with varimax rotation)
versus formants. Regression lines are drawn separately for men and women. PC1 
shows a better fit with F1 than PC2 with F2. 

Table 5.10. Results of the four multivariate analyses of variance: η² for each
significant factor (p < 0.05).

                            F1-F2     F1-F2-F3    PC1-PC2            PC1-PC2
η²                          Bark      Bark        joined analysis    sex separated
vowel                       0.875     0.743       0.735              0.750
speaker-sex                 0.499     0.690       0.193              0.024
site                        0.038     0.054       0.062              0.066
vowel*speaker-sex           0.091     0.069       0.060              -
vowel*site                  0.346     0.278       0.240              0.257
speaker-sex*site            0.031, 0.069, 0.066 [one of the four cells is missing; column assignment unclear]
vowel*speaker-sex*site      -         -           -                  -

5.2.2 Multivariate analysis 


One further way of comparing PCA of Bark-filtered vowel spectra with formant 
frequencies is to analyze to what extent the methods are able to separate different 
sources of variation in the data. This was done by carrying out manovas with the 
measures of vowel quality as dependent variables and vowel, site and speaker-sex 
as independent variables. Four different manovas were carried out with different 
dependent variables: 


1. F1 and F2 measured in Bark
2. F1, F2 and F3 measured in Bark
3. two PCs extracted using varimax rotation separately for men and women
4. two PCs extracted using varimax rotation including men and women in one
   single PCA


In order to make the data as normally distributed as possible a few front vowels
(the vowels elicited with the words disk, leta, nät, särk and typ) were left out of the
analysis. The independent variables of the analyses were vowel (with 15 categories),
speaker-sex (men and women), and site (Ankarsrum, Malung, Skog). The total
number of speakers was 35 (17 men and 18 women), and the number of vowel tokens
was 523 (due to a few instances of missing records). The significance level was
estimated using Pillai’s trace and the effect size was estimated by η², which shows
the proportion of the total variance in the dependent variables accounted for by the
independent variables. Table 5.10 shows the η² values of the significant results of
the four manovas.
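The idea behind η² as a proportion of variance is easiest to see in the univariate one-factor case, where it is the between-group sum of squares divided by the total sum of squares. The sketch below shows only this univariate analogue. The multivariate effect sizes reported in this section come from the manovas themselves and are not reproduced by it. The example values are invented.

```python
def eta_squared(groups):
    """Univariate eta squared: between-group SS divided by total SS."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total

# Invented F1 values (Bark) for three vowel categories.
by_vowel = [[2.9, 3.1, 3.0], [4.8, 5.2, 5.0], [6.9, 7.2, 7.0]]
# The same nine values regrouped by an unrelated factor.
by_other = [[2.9, 4.8, 6.9], [3.1, 5.2, 7.2], [3.0, 5.0, 7.0]]

print(round(eta_squared(by_vowel), 3), round(eta_squared(by_other), 3))
```

A factor that separates the values well, such as vowel category, yields a value close to 1, while an unrelated factor yields a value close to 0.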

As expected all analyses show a significant main effect of vowel. The effect size of
vowel is largest for the analysis with F1 and F2 as dependent variables (η² = 0.875).
The effect size decreases when adding F3 to the analysis (η² = 0.743). This can be
compared with the results of Adank (2003, 99-102), who used manova to compare
the effect of a number of speaker normalization procedures. Her data consisted of
recordings of the Dutch vowel phonemes by 160 speakers of Standard Dutch from
eight different geographic regions in the Netherlands and Flanders with an equal
distribution of female and male and older and younger speakers. Adank found an
effect size of η² = 0.893 for vowel when using two formants measured in Hertz as
dependent variables, and η² = 0.695 with three formants as dependent variables. The
best performing normalization procedure, Lobanov’s (1971) z-normalization, gave a
vowel effect of η² = 0.932 with two normalized formants as dependent variables and
η² = 0.760 with three normalized formants as dependent variables.

The main effect of vowel is smaller in the manovas with PCs as dependent
variables than when using the first two formants. The sex separated PCA (η² = 0.750)
gives a somewhat larger effect of vowel than the PCA where men and women were
analyzed together (η² = 0.735) and also larger than when using three formants.

The formant-based manovas show a large main effect of speaker-sex. With three
formants (η² = 0.690) this effect is even larger than with two (η² = 0.499), which was
also found by Adank (2003, 101-102). F3 seems to have even more sex-dependent
variance than F1 and F2, which explains why the main effect of vowel decreases
when adding F3 to the analysis.

In the PCs the effect of speaker-sex is much smaller than in formants. When
men and women were analyzed separately there is hardly any effect of speaker-sex
in the measurements of vowel quality (η² = 0.024), which confirms the results of
§ 5.1.5. But a PCA that includes both men and women also gives a smaller effect
of speaker-sex (η² = 0.193) than formants measured in Bark.

Jacobi (2009, 57) measured the area of the /i-a-u/ vowel triangles of men and
women based on formant measurements in Bark as well as PCs of Bark-filtered 
spectra. The formant measurements in Bark showed a significant difference in the 
sizes of the vowel spaces of men and women. The PCs, on the other hand, did not 
show any significant difference in the size of the vowel spaces of men and women; 
only the position of the vowel triangles differed between men and women in the PC 
plane. 

Applying PCA separately to men and women sets the mean of both groups to 
zero, which means that there is a correction for the different positions of the vowel 
spaces in the PC plane. Some speaker-specific variation is still left, but the biggest 
factor, sex, is removed. 
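The effect of processing the two groups separately can be illustrated on a single dimension. The sketch below replaces the PCA by plain standardization, which is a simplification, and uses invented values in which the women's values are the men's values shifted upward by a constant.

```python
import statistics

def standardize(values):
    """Center a group on its own mean and scale it by its own standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Invented raw values: the same vowel pattern, with women shifted upward.
men = [-1.5, -0.5, 0.5, 1.5]
women = [v + 0.8 for v in men]

joint = standardize(men + women)
joint_men, joint_women = joint[:4], joint[4:]
print("joint:   ", round(statistics.fmean(joint_men), 2),
      round(statistics.fmean(joint_women), 2))
print("separate:", round(statistics.fmean(standardize(men)), 2),
      round(statistics.fmean(standardize(women)), 2))
```

In the joint case the offset between the groups survives as a difference in group means. In the separate case both group means are zero, so only the variation within each group remains.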

Using the first two formants for measuring vowel quality seems to lead to a better
separation of vowels than in principal component solutions. However, formant
measurements show large differences between men and women. The speaker and
sex specific variation in formant measurements can be reduced by applying
normalization procedures (see § 2.4.4). But successful speaker normalization procedures
are generally based on the average and/or standard deviation of the vowel phonemes,
which makes comparison of language varieties with different phoneme systems
or different vowel centers difficult. Using PCA on Bark-filtered spectra does not
remove all speaker-specific variation in vowel pronunciation, but by applying PCA
separately to male and female data the systematic differences caused by the
anatomical/physiological differences between the sexes can be diminished. Moreover, a
PCA can be built up on point vowels of a number of speakers and subsequently
applied to a larger data set. Because of this, varieties with different phoneme systems
and/or varieties lacking some of the point vowels can be compared to each other.
This is an important precondition when comparing Swedish dialects, some of which
have phoneme systems that deviate strongly from Standard Swedish.


5.2.3 Interpreting principal components 


PCA of Bark-filtered vowel spectra was shown to correlate highly with formants 
in § 5.2.1. Exactly how the combination of a number of pass bands can result in a 
configuration so similar to formants is not completely straightforward to understand. 

Pols (1977) applied PCA to band-pass filtered vowel spectra and wrote (p. 49): 
“This mathematically well-defined factor representation is not as easy to interpret 
as a formant representation.” Pols (1977, 49-50) compared vowel spectra with the 
loadings of the PCA in order to explain the coordinate values of the vowels in the 
PC plane. This is also what is done in Figure 5.14; the average spectra of the point 
vowels are compared with the loadings of the extracted PCs. The figure is based 
on male speakers only, but it would look the same when based on female speakers, 
as is evident from Figures 5.2 (p. 61) and 5.7 (p. 65). Loadings > 0.6 and < −0.6
on the PCs are marked with a symbol. This shows that frequencies up to 10 Bark 
largely determine PC1, with negative correlations with the two first Bark filters and 
positive correlations with Bark filters 4-10. Frequencies above 11 Bark are the most 
important ones determining PC2. 

The lines of the open vowels [æ:] and [a:]/[a:] are similar to the one of PC1 in
starting low at the two first Bark filters and having a peak at 5-6 Bark. It follows
that they will have high positive scores on PC1. The close vowels [i:] and
[u:], on the other hand, show more of a mirrored curve of the loadings of PC1 at
the lower frequency regions. They start high but fall soon. Because they show the
mirror image of the loadings of PC1 they will get negative scores on this component.

The back vowels [u:] and [a:]/[a:] have a low F2 at about 6-8 Bark. Because
F2 is low there is only little energy in the higher frequency areas determining PC2.
Accordingly, these vowels are assigned negative scores on PC2. The front vowels [i:]
and [æ:], on the other hand, have a high F2; [i:] around 13 Bark and [æ:] around
11 Bark. Thus, the second formant of the front vowels falls in the higher frequency
regions that largely determine PC2. High energy levels in this frequency area lead
to high scores on PC2. This reasoning is confirmed by looking at the scores of the
point vowels in Figure 5.8 (p. 66).
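The reasoning about the signs of the scores can be made concrete by weighting a spectrum with the loadings. The sketch below uses the loadings of the male analysis as listed in Table 5.11. The two spectra are invented shapes (dB relative to the mean level) that only imitate an open back vowel and a close front vowel, and the simple weighted sum is a simplification of the regression method that was used to compute the actual scores.

```python
# Loadings for Bark filters 2-17 (male analysis), copied from Table 5.11.
PC1 = [-0.80, -0.85, 0.46, 0.75, 0.93, 0.96, 0.94, 0.93,
       0.76, 0.49, 0.25, -0.02, 0.03, 0.03, 0.12, -0.11]
PC2 = [-0.04, -0.13, -0.31, -0.14, -0.02, -0.06, -0.09, 0.10,
       0.54, 0.77, 0.91, 0.96, 0.94, 0.90, 0.83, 0.79]

# Invented spectra for filters 2-17, in dB relative to the mean level.
OPEN_BACK = [-10, -8, 2, 8, 10, 9, 6, 2, -2, -4, -5, -6, -6, -6, -7, -8]
CLOSE_FRONT = [10, 12, 2, -6, -10, -10, -9, -8, -6, -2, 2, 8, 9, 8, 6, 2]

def project(spectrum, loadings):
    """Weighted sum of the filter levels, with the loadings as weights."""
    return sum(level * weight for level, weight in zip(spectrum, loadings))

for name, spectrum in (("open back", OPEN_BACK), ("close front", CLOSE_FRONT)):
    print(name, round(project(spectrum, PC1), 1), round(project(spectrum, PC2), 1))
```

The open back shape comes out positive on PC1 and negative on PC2, the close front shape negative on PC1 and positive on PC2, in line with the description above.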

Some more insight can be acquired by looking at the interactions of PC1 and
PC2 at the frequency areas of the formants. Table 5.11 shows the loadings of the two
PCs of the male analysis. Additionally the table indicates within which Bark filter
the average formant frequencies of Swedish long vowels produced by male speakers
(Eklund & Traunmüller, 1997) are placed.⁹


⁹Since average formant frequencies are used, one should bear in mind that the actual variation
spreads across the given Bark filters.

Figure 5.14. The loadings of the PCA (2 components extracted with varimax
rotation; these are the same in all four graphs) and the mean intensities of each of
the point vowels. For loadings between −0.6 and 0.6 the symbol is omitted. Plots
are based on male data only.

Table 5.11 shows that F1 of close vowels is defined by negative loadings on PC1.
The frequency area of F1 of mid vowels has modest negative loadings on both
extracted PCs, and F1 of open vowels is defined by high positive loadings on PC1.

F2 of back vowels has an average frequency in the same frequency area as F1 of 
the most open vowels. F2 of back vowels is, thus, defined by high positive loadings 
on PC1. The frequency area of F2 of front vowels has high positive loadings on 
PC2. At the frequency area of F2 of central vowels the loadings of PC1 and PC2 
cross each other (compare also to Figure 5.6, p. 65), which means that F2 of central 
vowels is defined by moderately high positive loadings on both PCs. 

Different weightings of the Bark filters on the two extracted components thus 
account for the varying frequencies of F1 and F2. The overlapping frequency areas 
of F1 of open vowels and F2 of back vowels explain the fact that the formants cannot 
be completely separated by the PCA, but PC1 correlates not only with F1 but also, 
to some extent, with F2 (see Table 5.9, p. 75). 

F3 of all vowels is defined by high positive loadings on PC2, with somewhat lower
loadings for the highest F3 values. Variation in the intensity at the frequency area of
F3, hence, influences PC2. But because the whole frequency area has high positive
loadings, differences in the frequency of F3 are not likely to be well distinguished by
PC2.

Figure 5.13 (p. 77) shows that the relationship between PC2 and F2 is weaker
than the one between PC1 and F1. As can be seen in Table 5.11, PC2 is strongly
influenced by a frequency area higher than where F2 is found. PC2 is, hence, a
combination of the effect of F2 and of higher frequency regions. The varying intensities
in the higher frequency area most likely cause the variation in PC2 which cannot
be explained by F2.

Since Bark filters correspond to the critical bandwidth of human hearing and 
include information from the whole vowel spectrum, they model perception very 
well. Formant frequencies have on the other hand been shown by numerous studies 
to be very important cues for the perception of vowel quality. The PCs that result
from reducing Bark filter data to the most important underlying components are
influenced to a great extent by formants, as has been shown above.

Results from perception experiments that specifically compare the roles of
formants and of PCs of Bark-filtered spectra in perception would be interesting in
order to understand the relationships even better.

Table 5.11. Mean formant frequencies of Swedish long vowels produced by male
speakers according to Eklund & Traunmüller (1997), and the loadings of PC1 and
PC2 (two components extracted with varimax rotation; loadings >0.6 and < −0.6 in
boldface). Multiple vowels within the same Bark filter are given in ascending order
of formant frequency.


        loadings          formant frequencies
 bf     PC1      PC2      F1   F2   F3

  2    −0.80    −0.04
  3    −0.85    −0.13     [yz] [iz] [ur] [ra]
  4     0.46    −0.31     [er] [o:] [o:]
  5     0.75    −0.14     [a]
  6     0.93    −0.02     [eer] [uz] [os]
  7     0.96    −0.06
  8     0.94    −0.09     [az]
  9     0.93     0.10
 10     0.76     0.54
 11     0.49     0.77     [2e:] [0]
 12     0.25     0.91     [a]
 13    −0.02     0.96     [y:] [iz] [er]
 14     0.03     0.94     [o:] [eer] [ur] [ar]
 15     0.03     0.90     [az] [or] [ez] [ys]
 16     0.12     0.83     [iz]
 17    −0.11     0.79
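
The boldface criterion in the caption of Table 5.11 (loadings above 0.6 or below −0.6) can be applied mechanically to the loadings. The following is a minimal sketch, not part of the original analysis. The loadings are copied from the table, on the assumption that the two rows with unreadable filter numbers are filters 7 and 11; the function name `strong_filters` is hypothetical.

```python
# Loadings of PC1 and PC2 per Bark filter, copied from Table 5.11.
# Assumption: the rows with unreadable filter numbers are filters 7 and 11.
LOADINGS = {
    2: (-0.80, -0.04), 3: (-0.85, -0.13), 4: (0.46, -0.31), 5: (0.75, -0.14),
    6: (0.93, -0.02), 7: (0.96, -0.06), 8: (0.94, -0.09), 9: (0.93, 0.10),
    10: (0.76, 0.54), 11: (0.49, 0.77), 12: (0.25, 0.91), 13: (-0.02, 0.96),
    14: (0.03, 0.94), 15: (0.03, 0.90), 16: (0.12, 0.83), 17: (-0.11, 0.79),
}


def strong_filters(component, threshold=0.6):
    """Return the Bark filters whose loading on the given component
    (0 for PC1, 1 for PC2) exceeds the threshold in absolute value."""
    return [
        bark_filter
        for bark_filter, loadings in sorted(LOADINGS.items())
        if abs(loadings[component]) > threshold
    ]


print("PC1:", strong_filters(0))
print("PC2:", strong_filters(1))
```

Under this criterion the two components divide the filters between them without overlap: PC1 covers the lower filters and PC2 the higher ones, with filters 10 and 11 forming the transition zone discussed in the text.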


Chapter 6 


Analysis on the variable level 


In Chapter 5 a method for assessing vowel quality from acoustic speech samples by 
means of principal components (PCs) of Bark filtered vowel spectra was described. 
In the present chapter the described acoustic method is used for analyzing dialectal 
variation in Swedish vowel pronunciation. The variation in the acoustic variables of 
the 19 vowels described in § 4.2 is analyzed. Throughout this chapter the data is 
divided into two speaker groups per site: older and younger speakers. Each group 
includes approximately six speakers—three men and three women (see § 4.3). The 
arithmetic means of the speakers in the two speaker groups per site were calculated 
for PC1 and PC2 for each vowel (see the lowest part of Figure 5.12 on p. 72) and 
form the basis for all analyses presented in this chapter. 
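
The averaging step described above can be sketched as follows. This is an illustration only, not the procedure actually used for the thesis: the record layout `(site, age_group, vowel, pc1, pc2)`, the function name `group_means` and the example values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_means(measurements):
    """Average PC1 and PC2 over the speakers of each (site, age group, vowel).

    measurements: iterable of (site, age_group, vowel, pc1, pc2) tuples,
    one tuple per speaker and vowel.
    """
    buckets = defaultdict(list)
    for site, age_group, vowel, pc1, pc2 in measurements:
        buckets[(site, age_group, vowel)].append((pc1, pc2))
    return {
        key: (mean(pc1 for pc1, _ in values), mean(pc2 for _, pc2 in values))
        for key, values in buckets.items()
    }


# Hypothetical values, not taken from the SweDia data.
example = [
    ("SiteA", "older", "dis", -1.0, 1.5),
    ("SiteA", "older", "dis", -0.5, 1.0),
    ("SiteA", "younger", "dis", 0.0, 1.0),
]
means = group_means(example)
print(means[("SiteA", "older", "dis")])
```

Each key of the result corresponds to one data point per vowel in the analyses of this chapter: one speaker group at one site.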

A first impression of the variation is obtained by plotting the vowel data in 
the PC2/PC1 plane. Figure 6.1 displays one standard deviation ellipses¹ of the 19
Swedish vowels in the PC space. The data for drawing the ellipses comprised the 
average PC values in both speaker groups at each site measured at the temporal 
midpoint of the vowel segments. By using average values per speaker group for 
drawing the ellipses, the individual variation within the groups has been filtered 
out, and the ellipses show the amount of linguistic variation across sites and across 
the two age groups. The graph gives an idea about the average position of each
vowel in the PC space. The size and orientation of the ellipses indicate the amount
of variation in each vowel and the main direction of the variation. For example, the
vowel in dör has the largest ellipse, which means that this is the most variable vowel
across sites and across the two generations. From the orientation of the ellipses we
can see that, for example, the vowel in nät varies more on the PC1 values than
on PC2, while the vowel in lus varies more on PC2. Overlapping ellipses indicate
that the pronunciation of the different vowels shows a considerable amount of overlap
across varieties.


1 Ellipses are drawn by applying PCA separately for each vowel with the acoustic PCs as input
variables (Harrington, 2010, Ch. 6). The major and minor axes of the ellipses are the first two PCs
of the data and the longest axis, hence, shows the direction that explains most of the variance.

In § 6.1, the variation across sites and across the two generations in each vowel
is described.

In § 6.2, the amount of variation per vowel is quantified, and the vowels are 
compared to each other based on the amount of variation across sites (§ 6.2.1), and 
across the two age groups (§ 6.2.2). The results show which of the vowels vary the 
most geographically and which vowels are changing in apparent time. 

A factor analysis was carried out in order to identify vowels with similar dis- 
tribution patterns. The results are presented in § 6.3. The factor analysis showed 
co-occurrence of a number of vowel features. Each extracted factor corresponds 
to a distinct geographic and/or generational distribution pattern in the data. By 
visualizing the factor scores on maps these distribution patterns were identified. 

In § 6.4 the results obtained by the different analyses in §§ 6.1, 6.2 and 6.3 are 
compared and summarized. The Swedish place names and area names used in the 
text are found in the maps in Figures 2.1, 2.2 and 4.1. 


6.1 Variation per vowel 


In order to get an idea of the pronunciation and geographical variation of the 19 
different vowels in the data set, maps visualizing the PC values of each vowel were 
created. The maps are found in Appendix C. In these maps the two extracted 
PCs (see Chapter 5) are visualized by means of a two-dimensional color spectrum 
(Figure B.2, p. 211). In Appendix B (§ B.2) the assignment and interpretation of 
the colors are explained in more detail. 

The acoustic analysis of the vowels was made at nine temporal points in every 
vowel segment in order to include as much information as possible about formant 
movements. For displaying the results of each vowel, however, only the first and 
the last sampling point were chosen, which taken together should give an indication 
of the overall vowel quality. The maps show the pronunciation as measured near 
onset (at 25% of the vowel duration) and near offset (75%). For each site in the 
data set, the average vowel quality of the older speakers and the younger speakers 
is visualized separately. The maps give an overview of the variation across sites 
and across the two generations for each vowel. The values close to onset give an 
indication of the basic vowel quality, while the degree of diphthongization can be 
studied by comparing the PC values measured close to onset of the vowels with the 
values close to offset. For comparison, a Standard Swedish reference point (by six 
older and six younger speakers of Standard Swedish) is included in the upper left 
corner of each map. 
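
The comparison of onset and offset values described above can be sketched as a simple difference per PC. This is a hedged illustration: the function names, the Euclidean summary and the tolerance of 0.1 are assumptions made for the example, not definitions from the thesis. It relies only on the interpretation used in the text that higher PC1 values suggest a more open vowel.

```python
import math


def diphthongization(onset, offset):
    """Change from onset (25%) to offset (75%) for a (pc1, pc2) pair.

    Returns the change in PC1, the change in PC2 and the Euclidean
    length of the change in the PC plane.
    """
    d_pc1 = offset[0] - onset[0]
    d_pc2 = offset[1] - onset[1]
    return d_pc1, d_pc2, math.hypot(d_pc1, d_pc2)


def direction(d_pc1, tolerance=0.1):
    """Label the PC1 movement; the tolerance is an arbitrary example value."""
    if d_pc1 > tolerance:
        return "opening"   # PC1 rises towards offset
    if d_pc1 < -tolerance:
        return "closing"   # PC1 falls towards offset
    return "little change"


# Hypothetical group averages: high PC1 near onset, lower PC1 near offset.
change = diphthongization(onset=(1.0, 0.5), offset=(0.5, 0.5))
print(change, direction(change[0]))
```

A falling PC1, as in this example, is the pattern the text associates with closing diphthongs.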

Figure 6.1. The 19 vowels in the PC2/PC1 plane. The one standard deviation
ellipses are drawn based on the average PC values of the two speaker groups (older
and younger speakers) at each site measured at the temporal midpoint of each vowel.

Because of the problem of speaker-dependent variation in acoustic measures of
vowel quality (§ 2.4.3), variation in vowel pronunciation across individual speakers is
difficult to study. In this thesis a normalization of the variation related to speaker sex
is applied (§ 5.1.5) and averages of a number of speakers are used in order to reduce
the individual variation related to anatomical/physiological differences. As long as
all older speakers and all younger speakers at a site pronounce all vowels similarly,
using group averages gives a good representation of the vowel pronunciation. A risk
with using averages is, however, that variation within speaker groups might be lost.
For example, if half of the speakers in a group pronounced a vowel closer and the
other half more open, the group average would indicate a pronunciation between
these two, which would actually be true for none of the individual speakers.
Therefore, some caution is needed when interpreting group averages. In a study as
large-scale as the present one, with nearly a hundred sites and more than one
thousand speakers, the group averages should still give a good indication of the
geographic and generational variation.
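
The risk with group averages can be made concrete with a small numeric sketch. The PC1 values below are invented for illustration and do not come from the data set; `summarize` is a hypothetical helper.

```python
from statistics import mean, pstdev


def summarize(values):
    """Return the mean and the population standard deviation of a group."""
    return mean(values), pstdev(values)


# Hypothetical PC1 values for two groups of six speakers.
split_group = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]   # half closer, half more open
uniform_group = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]    # all speakers alike

print(summarize(split_group))
print(summarize(uniform_group))
```

Both groups have the same mean, so they would look identical in an analysis based on averages; only the spread within the group reveals that no speaker in the first group actually uses the intermediate pronunciation.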

In this section the variation in each vowel is described and the maps are inter- 
preted. The maps give an overall impression of the dialectal variation. Because 
some fine-grained differences between sites are hard to detect in the color spectrum 
in the maps, the numeric values in the original data files were examined, too, for 
a thorough description of the variation. When unexpected results or outliers were 
found in the data, a comparison was made with the sound files in order to exclude 
the possibility that mistakes in the segmentation or in the acoustic analysis of the 
vowels would influence the analyses. 

In the appendix the maps are organized such that the corresponding² long and
short vowels are placed adjacently, with the long vowel first, when both vowels are
present in the data set. Front vowels are presented first, starting with the close
unrounded front vowels and going on with more open and rounded front vowels.
After the front vowels the back vowels are presented in reverse order, that is,
starting with the most open back vowels and ending with the close ones. Below, the
vowels are presented in the same order as in the appendix, which is: dis (Standard
Swedish /i:/), disk (/ɪ/), typ (/y:/), flytta (/ʏ/), leta (/e:/), lett (/ɛ/), lus (/ʉ:/),
nät (/ɛ:/), lär (the pre-/r/ allophone [æ:] of /ɛ:/), särk (the pre-/r/ allophone [æ]
of /ɛ/), söt (/ø:/), lös (/ø:/), dör (the pre-/r/ allophone [œ:] of /ø:/), dörr (the
pre-/r/ allophone [œ] of /ø/), lat (/ɑ:/), lass (/a/), lås/låt (/o:/), lott (/ɔ/), sot
(/u:/).


6.1.1 dis /i:/ 


The PC values of the vowel elicited with the word dis are displayed in Figure C.1 
(p. 214). The Standard Swedish pronunciation of the vowel is [i:]. As can be seen
in Figure 6.1, the vowel has low PC1 values and high PC2 values, which yields the 
blue colors in the maps in Figure C.1. The orientation of the ellipse of the vowel in 
Figure 6.1 shows the main direction of variation. The PC2 values vary more than 
the PC1 values across sites and generations, but the two variables co-vary to some 
extent. 

In the South Swedish area, the maps displaying vowel quality close to onset show
somewhat lighter blue colors than the maps of vowel quality close to offset. This
holds for both the older and the younger speakers. The lighter color indicates higher
PC1 values, that is, a more open vowel. Close to offset most of these dialects have
clear blue colors. This fits well with the South Swedish diphthongization described
in § 2.3.2.1. The long vowel is pronounced as a closing diphthong, which begins
with a more open pronunciation and ends approximately with the Standard Swedish
vowel quality.

2 That is, phonologically corresponding long and short vowels, written with the same
orthographic symbol. See Table 2.1 (p. 17).

Younger speakers at many sites in Götaland have lighter blue colors at both
measuring points, which suggests a more open pronunciation throughout the vowel
than in other varieties. A more open pronunciation than the Standard Swedish
one was also shown to be the most frequent pronunciation of /i:/ by teenagers in
the surroundings of Alingsås in Västergötland (to the north-east of Göteborg) by
Grönberg (2004). Relatively high PC1 values are also found for younger speakers in
the province Närke (Stora Mellösa, Viby) and in the neighboring location Järnboås
(Västmanland).

An area comprising many Svealand varieties and the Finnish south coast has 
markedly low PC2 values (darker color). Because of the lower correlation between 
PC2 and F2 than between PC1 and F1, especially for high F2 values (see Figure 5.13, 
p. 77), it is not possible to tell if the lower PC2 values in this area are due to a low F2 
or some other spectral feature that these varieties share. Low F2 values could be an 
indication of the so-called “damped” i (§ 2.3.2.5), which is reported in many scattered 
dialects, for example in Uppland, south Bohuslän, south Östergötland and Medelpad
(Elert, 2000, 44-45). However, the damped pronunciation has not previously been 
attested in Finland-Swedish varieties and is therefore not likely to be found there. 


6.1.2 disk /ɪ/


Figure C.2 (p. 215) shows the PC values of the vowel in the word disk. The Standard 
Swedish pronunciation of the vowel is [ɪ]. Both the average PC1 and the average PC2
values of the vowel in disk are somewhat higher than in the corresponding long vowel 
(the vowel in dis), as can be seen in Figure 6.1 (p. 87). The vowel shows relatively 
little variation across sites and generations. The PC2 values vary more than the 
PC1 values. 

Older speakers in the province Dalsland and on the Swedish west coast have 
higher PC1 values than what is found elsewhere. Younger speakers have somewhat 
higher PC1 values than older speakers in general. 

As was the case for the vowel in dis, the Svealand dialects and dialects along the 
Finnish south coast have lower PC2 values in disk than what is found in the rest of 
the language area. This holds especially for the older speakers. 

The differences between vowel quality close to onset and close to offset are very 
small. Only an area in the west of Götaland shows a larger difference, especially
for the older speakers. It is mainly the PC1 value that is higher close to offset than 
close to onset. 


6.1.3 typ /y:/


The maps displaying the vowel in typ are found in Figure C.3 (p. 216). In Standard 
Swedish the vowel is a close rounded front vowel, [y:]. The average PC values are 
higher for the vowel in typ than for the Standard Swedish unrounded close front 
vowel in dis (see Figure 6.1, p. 87), but the main direction of variance is roughly the 
same. 

In the South Swedish area, a clear difference between the values close to onset and 
close to offset can be seen in the maps. The PC1 values are high close to onset and 
lower close to offset. The same thing was noted for the vowel in dis (§ 6.1.1). These 
closing diphthongs are part of the South Swedish diphthongization (see § 2.3.2.1). 
Närpes in Finland also shows a similar kind of diphthongization in typ.

As for the vowel in dis, younger speakers at many sites in Götaland and in Stora
Mellösa (Närke), Viby (Närke) and Järnboås (Västmanland) have light blue colors at
both measuring points, which suggests a relatively open pronunciation throughout 
the vowel in typ. 

The lowest PC2 values are found in Svealand and on the Finnish south coast, as 
well as in Bohuslän.


6.1.4 flytta /ʏ/


The PC values of the vowel elicited with the word flytta (y) are displayed in
Figure C.4 (p. 217). The Standard Swedish pronunciation is [ʏ]. The PC1 values are
higher than for the long y in typ resulting in lighter colors in the maps. The average 
PC2 values are lower than for the typ vowel. Both the PC1 and PC2 values of the 
flytta vowel vary a great deal and independently of each other giving the almost 
round ellipse in Figure 6.1 (p. 87). The maps showing pronunciation close to onset 
and close to offset are relatively similar. 

Of the short front close vowels, the flytta vowel shows considerably more variation 
than the disk vowel. A western area, from the province of Bohuslän in the south to
Jämtland in the north, including some sites on the east coast of Norrland, has high
PC1 values, resulting in lighter blue colors and suggesting a more open pronunciation
than in Standard Swedish. For younger speakers in Götaland the PC1 values are
not as high as for the older speakers. In the more northern provinces, however, the
lighter colors are found in both generations. This opening of the vowel in flytta
could be the result of a change from Proto-Nordic short i and y to e and ö, which
is considered a typical feature for Götaland dialects (Pettersson, 2005, 151, 224).³
Pamp (1978, 88) notes that this southern feature has also spread to parts of the
province Värmland. In Härjedalen, Pamp (1978, 120) mentions a change from y to
ö in front of some consonants or consonant combinations (namely nd, ns, nt, m, v
and f), but he does not mention a change in front of t. For the province Jämtland,
Pamp (1978) does not mention a lowering of y.


3 Dialects in Svealand and further to the north were also affected by this change, but in a more 
restricted number of phonological contexts. 

When it comes to the PC2 values, lower values than in other varieties are found
in a similar area as for the other close front vowels, that is, mainly in Svealand and
on the Finnish south coast. Very low values are also found in Skee (Bohuslän). The
highest PC2 values are found in Norrland. 


6.1.5 leta /e:/ 


The root vowel, e, in the word leta is displayed in Figure C.5 (p. 218). The Standard 
Swedish pronunciation is [e:]. The vowel shows considerable variation across variet- 
ies, especially on the PC1 values (see Figure 6.1, p. 87). The maps displaying the PC 
values near onset and near offset are quite different. There is a general trend towards 
lighter colors close to offset than close to onset, which means higher PC1 values and 
a more open pronunciation close to offset. But other types of diphthongization can 
be identified as well. 

PC1 values that increase towards offset are especially prominent among younger 
speakers. A similar opening of /e:/ towards the end of the vowel was also found in 
formant measurements of speakers from the greater Stockholm area by Eklund & 
Traunmüller (1997) (see § 2.3.1).

An opposite kind of diphthongization, high PC1 values close to onset that
decrease towards offset, is found among South Swedish varieties (mainly in Skåne),
on Gotland, in Jämtland (Aspås, Berg, Åre) and Norrbotten in Norrland, and in
Houtskär (Åboland) and Närpes (Österbotten) in Finland. The highest PC1 values
close to onset are found in Närpes. Older speakers in Överkalix (Norrbotten) have
a large difference between onset and offset in the PC2 values, too. The vowel in leta
was a diphthong, /ai/, in Proto-Nordic. The Proto-Nordic diphthongs were
monophthongized in most varieties of Swedish, but were preserved in dialects on Gotland,
parts of Norrland and in Finland (except Åland) (Pettersson, 2005, 211). In the
present data set the diphthongization values are lower for younger speakers than for 
older speakers in all sites where the Proto-Nordic diphthong has been preserved. It 
seems that the diphthong is disappearing in Norrland. Younger speakers on Gotland 
have higher diphthongization values than the younger speakers in Norrland, but the 
difference between values near onset and near offset is still so small for the younger 
speakers on Gotland that it is hardly recognizable in the color spectrum in the maps. 
This could mean that only a few of the younger speakers on Gotland are using the 
variant with the diphthong. In Finland the diphthong is only found at two sites in 
the data set. At these two sites it seems to be stable, that is, present also among 
younger speakers. 

In South Swedish, Proto-Nordic diphthongs have not been preserved, but the 
diphthongization of /e:/ is part of the South Swedish diphthongization (§ 2.3.2.1), 
which affects all long vowels. 

For PC2, the pattern is very similar to that of the other front vowels discussed
above.


6.1.6 lett /ɛ/


In Figure C.6 (p. 219), the maps display the PC values of the vowel in lett. In 
Standard Swedish the vowel is pronounced [ɛ], that is, somewhat more open than
the corresponding long vowel [e:]. Also in many of the dialects the vowel in lett
is more open than the vowel in leta, as can be seen by the lighter colors in the 
maps. Figure 6.1 (p. 87) shows that the PC1 values vary more than the PC2 values, 
exactly as for the corresponding long vowel. For the short vowel the differences 
between values close to onset and close to offset are not as large as for the long 
vowel. The PC1 values show more diphthongization than the PC2 values. 

The maps of the younger speakers have more light colors than the maps of the 
older speakers. This is due to higher PC1 values, which suggests a more open 
pronunciation among younger speakers. At all but eight sites, the younger
speakers have higher PC1 values than the older speakers.

The older speakers in Sproge (Gotland) and Överkalix (Norrbotten) have very
high PC1 values and low PC2 values close to onset. The PC1 values decrease and
PC2 values increase towards offset. As in leta (§ 6.1.5), the vowel in lett was the
diphthong /ai/ in Proto-Nordic. The PC values of Sproge and Överkalix suggest
that the Proto-Nordic diphthong has been preserved in these varieties. A similar
diphthongization on PC1, but to a lesser extent, is found for example in Frostviken
and Strömsund in Jämtland, Södra Finnskoga (Värmland), Piteå (Norrbotten) and
Ankarsrum (Småland). For the short vowel in lett fewer dialects have preserved the
diphthong than for the long vowel in leta.

Diphthongization along PC1 in the opposite direction is found in Västergötland.
Here, the PC1 values are relatively low close to onset, but increase markedly,
resulting in a very light color close to offset.

The PC2 values show a similar pattern as for other front vowels, that is, low 
values are found at many sites in Svealand and along the Finnish south coast. 


6.1.7 lus /ʉ:/


The vowel in lus is displayed in Figure C.7 (p. 220). The linguistic variation is 
considerable, especially along PC2, as can be seen also in Figure 6.1 (p. 87). 

In east Svealand and at some sites in western parts of Norrland, younger speakers 
have higher PC1 values than older speakers, suggesting that the pronunciation of the 
vowel is becoming more open. A development of /ʉ:/ to a more open, more fronted
and less rounded vowel in the town Eskilstuna in Södermanland (about 100 km
west of Stockholm) has been described by Nordberg (1975). This development
co-varies to some extent with the opening of /ø:/ in Eskilstuna (see § 6.1.11). The
higher PC1 values among younger speakers as compared to older speakers in the 
present data set could indicate a similar development. 

At many sites in Götaland and Norrland, the more open pronunciation is found
among both older and younger speakers. An open pronunciation of /ʉ:/ among
teenagers in Västergötland has been described previously by Grönberg (2004).

Some of the sites along the Finnish south coast have a dark yellowish color very
different from the color of most sites in Sweden. These dialects have very low PC2
values. As mentioned in § 2.3.2.7, the pronunciation of the lus vowel was [u:] in
Proto-Nordic, but the pronunciation has been fronted during the centuries. In many
dialects in Finland the fronting has not proceeded as far as in most parts of Sweden.
Markedly low F2 values in /ʉ:/ in Finland-Swedish as compared to the standard
language in Sweden have been measured by Reuter (1971) and Kuronen (2000).
Apart from Finland, low PC2 values are also found in Skee and among older
speakers in Malung. In Skee the PC1 values are lower, resulting in a darker color.

Diphthongized pronunciations of the lus vowel are found in South Swedish vari- 
eties, on Gotland and in a few other scattered sites. In the South Swedish area, 
the PC1 values decrease during the vowel (going from gray to blue on the maps), 
indicating a closing vowel. On Gotland, on the contrary, the PC1 values increase 
during the vowel segment, and the change in PC2 values is larger than for PC1. 
The PC2 values decrease during the vowel on Gotland. Elert (2000, 42) mentions
the pronunciation [*#] for /ʉ:/ in standard-like speech on Gotland and Pamp (1978,
77) describes the pronunciation of /ʉ:/ in Gotlandic dialects as eo (IPA: [eu]) or äo
(IPA: [ɛu]). The low PC1 values in the varieties on Gotland in Figure C.7 suggest a
relatively close pronunciation close to onset. 

In Norrbotten, diphthongization is found among older speakers, but the four sites 
in Norrbotten have different types of diphthongization. A similar diphthongization 
to that on Gotland is found for older speakers in Nederluleå. In Överkalix the PC1
values close to onset are higher than in Nederluleå and the change in PC2 is smaller.
In Kalix the PC1 values close to onset are very high and decrease towards offset, 
while the change in PC2 is small. 

In Vörå (Österbotten), there is a relatively large change in PC2 between onset
and offset (higher close to onset than close to offset) and the PC1 values are relatively
high. Older speakers in Dragsfjärd (Åboland) have higher PC2 values close to onset
than close to offset, too. 


6.1.8 nät /ɛ:/


The vowel in nät has the highest average PC2 value of the vowels in the data set (see
Figure 6.1, p. 87). The PC1 values vary more across dialects than the PC2 values.
The vowel is relatively open (Standard Swedish [ɛ:]), resulting in light colors in the
maps in Figure C.8 (p. 221).

In the map of vowel quality close to onset of the older speakers, an eastern area
can be distinguished from the rest. This area includes Uppland, Gotland and many
Finland-Swedish varieties except for the ones spoken on the Åland islands. The sites
in this area have a darker color than most other sites, which means lower PC1 values
and suggests a closer pronunciation. The dialects close to Stockholm, in Uppland
and Södermanland, and some dialects in Finland are known from the literature (see
§ 2.3.2.6) to lack the distinction between /e:/ and /ɛ:/, and both phonemes are
pronounced [e:]. On Gotland the pronunciation of /ɛ:/ is close to [e:], too, but the
phonemic distinction is maintained because /e:/ is diphthongized, [ei] (Elert, 2000,
47).

Outside the area mentioned, low PC1 values are found also in Malung in Dalarna 
and in Skee and Orust in Bohuslän.

On Gotland and in Närpes (Österbotten), the PC1 values are much lower close
to onset than close to offset, indicating an opening diphthong in these dialects. This
is in line with the Gotlandic diphthongization of mid vowels (see § 2.3.2.3). From
Närpes the pronunciation [nte:t] is reported in Ordbok över Finlands svenska folkmål
(Ahlbäck & Slotte, 1976-2007).

In the South Swedish area, there is also diphthongization, especially among the 
older speakers. This diphthongization goes in the opposite direction to the one on
Gotland: the PC1 values are higher close to onset than close to offset. This matches 
the description of South Swedish diphthongization (§ 2.3.2.1): long front vowels are 
pronounced as closing diphthongs that start with a more open pronunciation and 
end approximately with the Standard Swedish vowel quality. 

The maps of the younger speakers generally have lighter colors than the maps of 
the older speakers. Only at seven sites (Borgå, Burseryd, Fole, Jämshög, Malung,
Snappertuna, Össjö) do younger speakers not have higher PC1 values close to onset
than the older speakers. The dialects in the surroundings of Stockholm are the ones
that seem to be changing the most. Elert (2000, 47) notes that the close [e:] in words
like nät has for a long time been regarded as not generally accepted Stockholm
pronunciation and that this pronunciation is decreasing in use. As reasons for the
change, Elert mentions the general trend towards a more orthographic pronunciation
and that the merger of /e:/ and /ɛ:/ has not been accepted by school teachers.

The diffusion of an even more open pronunciation of /ɛ:/ (in other contexts than
before /r/) has been noted among teenagers in Eskilstuna (Hammermo, 1989) and
Stockholm (Kotsinas, 1994). In both studies, the extremely open pronunciation [æ:]
was used more by girls than by boys. 

Young speakers on Gotland, in Finland and in Malung (Dalarna) seem to hold 
on to a close pronunciation of the nät vowel. From being an eastern dialect feature,
the close pronunciation of the nät vowel has, hence, changed to a peripheral feature.


6.1.9 lär [æ:]


The vowel in the word lär is displayed in the maps in Figure C.9 (p. 222). In
Standard Swedish the vowel is an open front vowel, [æ:] (the pre-/r/ allophone of
/ɛ:/). In the dialects both the PC1 and PC2 values are high, resulting in light
colors in the maps. The PC1 values vary more than the PC2 values, as can be seen
in Figure 6.1 (p. 87), but the oblique direction of the ellipse shows that the PCs
are to some extent dependent on each other: higher PC1 values mean lower PC2
values.

The maps show that there is a large difference between older and younger speakers.
The maps of the younger speakers are lighter, indicating higher PC1 values
which means a more open pronunciation. 

In the maps of the older speakers an eastern area can be distinguished. This area 
includes Uppland, Gotland and Finland. In this area light yellow or grayish shades 
(high PC1) are found, while the rest of the language area is blue. The high PC1 
values correspond to the open Standard Swedish pronunciation of the vowel [æ:].
The lowest PC1 values are found at a few sites in Norrland (Aspås, Färila, Kalix,
Nederluleå, Strömsund), in Houtskär (Åboland) and around the lake Vänern. In
the area around Vänern the PC1 values increase towards offset, while the difference
between onset and offset does not seem to be that large in most of the Swedish area. 
The PC2 values vary less among the older speakers than the PC1 values and they 
are more stable between onset and offset. The eastern area (especially Uppland and 
Finland) has somewhat lower PC2 values than the other dialects, as shown for other 
front vowels. 

Elert (2000, 48-49) notes that almost all varieties of Standard Swedish have a 
more open pronunciation of /ɛ:/ when it is immediately followed by an r than in
other positions, but the degree of openness of the more open allophone varies a lot.
According to Elert, the most open pronunciations are found in the east, for example
in Stockholm. More to the west, the vowel can be relatively close, even to the degree
that the pronunciation of /ɛ:/ is the same in all positions. Comparing the maps of
nät (Figure C.8) and lär (Figure C.9) of the older speakers, it is obvious that the
largest difference between the two vowels is made in Uppland, Gotland and Finland.⁴
For many sites in Götaland and Norrland the color is very similar for the two vowels,
suggesting only a small or no difference. 

Nordberg (1975) mentions that there is a social stratification of the pronunciation 
of /ɛ:/ in the town Eskilstuna in Södermanland. In Nordberg’s study, the highest
social group followed the Standard Swedish norm by using [æ:] before /r/ and [ɛ:] in
other contexts. In the lower social group, however, younger speakers had generalized
[æ:] in all contexts, while older speakers used [e:] in all contexts.

What Elert (2000) describes holds for the lär vowel of older speakers in the
present data set. In the east the pronunciation is more open than in the west. But
in the younger generation a very different picture emerges. The yellow color that is
found only in the east for the older speakers has spread markedly among younger
speakers. At all sites except five (Fole, Fårö, Gräsö, Haraker and Närpes) in the
data set the younger speakers have on average higher PC1 values close to onset than
the older speakers. A similar trend was noted for the nät vowel in the previous
section. The general opening of /ɛ:/ by younger speakers, which Nordberg (1975)
found in the lower social group in Eskilstuna and which Hammermo (1989) and 
Kotsinas (1994) noted especially for young girls in Eskilstuna and Stockholm (see 
above § 6.1.8), seems to have spread to almost the whole language area. 


4 In Uppland and Finland the pronunciation of /ɛ:/ is so close that the phonemes /e:/ and
/ɛ:/ have merged. On Gotland the pronunciation of /ɛ:/ is very close, too, but because /e:/ is
diphthongized, there is no merger. See § 6.1.8.


96 Chapter 6. Analysis on the variable level 


In the eastern area, where the older speakers have an open pronunciation of the
lär vowel, the difference between older and younger speakers is small. In Kalix and
Nederluleå in Norrbotten (where the older speakers have markedly low PC1 values)
the difference between older and younger speakers is the largest.


6.1.10 särk [æ]


The vowel in the word särk is one of the least variable according to the ellipses
in Figure 6.1 (p. 87). The vowel is an open front vowel in Standard Swedish, [æ],
and the high PC1 and PC2 values suggest that the same is true for most dialects.
Accordingly, the maps in Figure C.10 (p. 223) show only light blue and light yellow
colors.

The PC1 values are somewhat higher in Norrland than in Svealand and Götaland.
This holds more for the older speakers than for the younger speakers. As for other
front vowels, the lowest PC2 values are found in many places in Svealand.

An area including sites in the provinces Närke, Södermanland and Västmanland
has lower PC1 values and higher PC2 values than most other areas, resulting in a
bluer color. Relatively low PC1 values close to onset are also found for the older
speakers in Överkalix (Norrbotten) and for older speakers in the South Swedish area.

Differences between onset and offset are generally not large in särk. In the South
Swedish area, however, the PC1 values increase towards offset.


6.1.11 söt /ø:/


The pronunciation of the vowel in the word söt (Standard Swedish [ø:]) is relatively
homogeneous among older speakers. The blue color throughout the map of the
pronunciation of older speakers in Figure C.11 (p. 224) indicates high PC2 values.

There is a general tendency towards a more open pronunciation (higher PC1
values) close to offset than close to onset, manifested in lighter colors in
the maps displaying the pronunciation close to offset. This is in line with the
diphthongization of mid vowels in Stockholm speech reported by Eklund & Traunmüller
(1997) (see § 2.3.1). Dialects in the provinces Skåne, Gotland and Norrbotten go
against this general trend, having a closer articulation at offset than
at onset, which indicates a closing diphthong.

The maps show a large difference between older and younger speakers in some
areas. Younger speakers have higher PC1 values than older speakers in general,
which indicates that the pronunciation is becoming more open. Nordberg (1975)
describes the development of a more open pronunciation of /ø:/ in Eskilstuna in
Södermanland. Eskilstuna was industrialized in the 19th century, with rapid growth
and immigration from the surrounding countryside as consequences. In the
surrounding rural dialects there was no open allophone of /ø:/, but the vowel was a
mid-close vowel in all contexts. When the immigrants in Eskilstuna wanted to
imitate the more prestigious Stockholm pronunciation, they started to use the open
allophone [œ:] of /ø:/, which is used only before /r/ in Standard Swedish. However,
they overgeneralized the open allophone and used it in positions other than before
/r/ as well. This “socio-linguistic hypercorrection” (Nordberg, 1975, 603) became a
well known feature in the local vernacular of Eskilstuna, but the pronunciation was
heavily stigmatized and it was not accepted in schools. In Nordberg’s data from the
end of the 1960s, speakers in the lower social group had a more open pronunciation
than the higher social group. However, there was also a correlation with age: the
younger the speaker, the more open the pronunciation. Men used more open
variants than women, and the youngest men (age 16-30) in the higher social group had
as open a pronunciation as the young speakers in the lower social group. The open
pronunciation was, hence, spreading from the lower social group to the higher social
group, which suggested that a change from below was in progress. In a study of school
children recorded in Eskilstuna in 1977-79, Hammermo (1989) found social
stratification and age-related variation in the /ø:/ vowel similar to that found by
Nordberg (1975), but in Hammermo’s data the young girls were using more open
variants than the boys.

Nordberg (1975) mentions that, apart from in Eskilstuna, a more open
pronunciation of /ø:/ was spreading quickly among younger speakers in central Sweden. In
Stockholm, this open pronunciation was considered rural and was stigmatized.
However, in a study of the language of teenagers in Stockholm recorded in 1989-1991,
Kotsinas (1994) showed that the open pronunciation of /ø:/ had become regular
among both lower-class and upper-class teenagers. Andersson (1994) noted that an
open pronunciation of /ø:/ was becoming common among young speakers in
Göteborg, too, and Grönberg (2004) characterized the open variant of /ø:/ as a marker of
Swedish youth language, particularly associated with the language of young people
in the cities.

In the maps in Figure C.11 the most open pronunciations among older speakers
are found in Medelpad, Härjedalen, Åland, Gräsö (Uppland) and northern parts of
Östergötland. The shift towards a more open pronunciation seems to be strongest
in central Sweden and along the coast in the west. In this area, younger speakers
have significantly higher PC1 values and to some extent also lower PC2 values than
older speakers. Also for the speakers who represent Standard Swedish a clear
difference between older and younger speakers can be noted. Norrland and mainland
Finland are less affected by this opening of the söt vowel and so are the provinces
Västergötland and Dalsland.

From previously having been a socio-linguistic hypercorrection and a
stigmatized variant, the open pronunciation of /ø:/ has spread to a large geographic area,
distinguishing younger speakers from older speakers.


6.1.12 lös /ø:/


Even though the vowels in the words lös and söt are the same vowel phoneme in
Standard Swedish, /ø:/, the vowel in lös varies more across dialects than the vowel
in söt, as can be seen by the size of the ellipses in Figure 6.1 (p. 87). The maps
displaying the vowels in söt (Figure C.11, p. 224) and lös (Figure C.12, p. 225) look
relatively similar for the younger speakers, but differ in some areas for the older
speakers.

In Proto-Nordic lös had a diphthong, while söt had a monophthong (see § 4.2.1).
Some dialects have preserved the vowels as two different phonemes. In the maps,
differences between the two vowels are found mainly among older speakers in Norrland.
Above all the difference in the provinces Jämtland and Härjedalen is striking. The
diphthongization of the vowel in lös is strongest among older speakers in Kalix and
Nederluleå in Norrbotten and Burträsk in Västerbotten. In Kalix and Nederluleå
the PC2 values decrease towards offset, while the PC1 values increase. In Burträsk
the PC1 values decrease considerably towards offset.


6.1.13 dör [œ:]


The vowel in dör has the largest ellipse in Figure 6.1 (p. 87), and is, hence, the most
variable vowel. Both the PC1 and the PC2 values show considerable variation and
there is a correlation between the two variables: higher PC1 values generally go with
lower PC2 values. The maps in Figure C.13 (p. 226) show both large geographic
variation and variation across the two age groups of speakers. The Standard Swedish
pronunciation is [œ:]. The clear blue color in many areas in the maps suggests a
much closer pronunciation than in Standard Swedish. The differences between
values close to onset and close to offset are also large for some dialects.

The PC1 values are generally lower close to onset than they are close to offset and
for most varieties the PC2 values decrease towards offset. Among older speakers,
high PC1 values are found mainly in an eastern area. This includes almost the whole
east coast of Sweden, with the islands Gotland and Öland and many of the sites in
Finland. The South Swedish varieties have high PC1 values as well.

Relatively low PC1 values and high PC2 values close to onset are found among
older speakers in most of Götaland and in provinces close to the Norwegian
border (Värmland, Härjedalen). A close pronunciation of /ø:/, also before /r/, was
considered typical for the dialects of Västergötland, the core area of Götaland, by
Landtmanson (1952, 38-39).

There is a large difference between older and younger speakers in the dör vowel.
The younger speakers have higher PC1 values than the older in all but eight of
the sites in the data set. Only in a small western area are low PC1 values found
among the younger speakers. This area includes the provinces Västergötland and
Dalsland. Grönberg (2004) studied the variable ‘/ø:/ before /r/’ among teenagers in
Västergötland and found a surprisingly high frequency of the local close variant of the
vowel. While some other features of the traditional local dialect were disappearing
rapidly, the close pronunciation of /ø:/ before /r/ seemed persistent, which led
Grönberg to conclude that “there is a chance that it would live on as a part of a
West Swedish or Västgöta regional standard” (Grönberg, 2004, 344). This statement
seems to be supported by the results in the maps in Figure C.13 (even though the
subjects of Grönberg’s study were recorded at about the same period in time as the
subjects of this thesis, which means that no diachronic conclusions can be drawn).

Many of the Norrland varieties (particularly older speakers) have very high PC2
values. Dialects in Uppland, Gotland and along the Finnish south coast have low
PC2 values. Among younger speakers, the lowest PC2 values are found on Gotland.
In Uppland and on the Finnish south coast younger speakers have somewhat higher
PC2 values than older speakers.

In Österbotten and Houtskär (Åboland) in Finland and among older speakers
in Norrbotten and Jämtland the PC1 values are higher close to onset than close to
offset. The decreasing PC1 values suggest that the Proto-Nordic diphthong, /eu/,
has been preserved. In Norrbotten, in Närpes (Österbotten) and Sproge (Gotland)
the PC2 values are also considerably lower close to onset than close to offset.

South Swedish varieties have higher PC2 values near onset than near offset of 
the vowel. 


6.1.14 dörr [œ]


The vowel in dörr is displayed in Figure C.14 (p. 227). In Standard Swedish the
vowel in this word is the open allophone [œ] of the phoneme /ø/. This vowel is
known for its variability across dialects. In several places, especially in cities, there
is a merger of /ø/ and /ɵ/. The merger is more common before /r/ than in other
contexts (Elert, 2000, 48).

The maps in Figure C.14 show clear geographic variation for the dörr vowel, but
the variation is relatively stable across the two generations of speakers. The vowel
in dörr varies less than the vowel in dör, as shown in Figure 6.1 (p. 87). Still, the
geographic variation in the two vowels shows some similarities. Large differences
between the two vowels are found particularly in Götaland and western parts of
Svealand, where many dialects have higher PC1 values and lower PC2 values in the
dörr vowel than in the dör vowel, which means dörr has a more open pronunciation
than dör. In Uppland the PC1 values are lower for the dörr vowel than for the dör
vowel.

Because /ɵ/ is not included in the data set (see § 4.2.2), it is not possible to draw
conclusions about a possible merger of the two vowels. However, gray colors on the
maps, for example in Uppland, indicate a central vowel. A central pronunciation of
/ø/ is likely to be rather similar to the pronunciation of /ɵ/.

The differences between values close to onset and close to offset are smaller for
the dörr vowel than for the dör vowel. Large differences are found in Härjedalen
and Jämtland and in Närpes in Finland, where the PC1 values are very low close to
onset and increase towards offset. In the South Swedish area the PC1 values increase
and PC2 values decrease towards offset.

The vowel in dörr is not unambiguously a short vowel in dialects even though it is
a short vowel in Standard Swedish. In some dialects the vowel has been lengthened.
Transcriptions of the present data set show that this is the case at least for dialects
in Jämtland and Härjedalen and for some dialects in Dalarna.


6.1.15 lat /ɑ:/


The vowel in lat is an open back vowel in Standard Swedish. The position of the
ellipse of the vowel in the PC2/PC1 plane in Figure 6.1 (p. 87) and the yellow
color in the maps in Figure C.15 (p. 228) suggest a similar pronunciation for most
dialects. The dialectal differences are relatively small. When it comes to the PC1
values, there is only one outlier: the older speakers in Överkalix (Norrbotten) have
an extremely low value compared to all other varieties.

High PC2 values are found among older speakers in Norrbotten and
Västerbotten. Also in the South Swedish area very high PC2 values are found, Våxtorp and
Össjö having the most extreme values. On Gotland and in Finland the PC2 values
are relatively high compared to most other varieties, too.

In a cluster analysis of Swedish dialects based on Mel frequency cepstral
coefficients of the vowel in lat, Lundberg (2005) identified three dialect clusters.⁵ The
first cluster, representing a “broad [a:] sound” according to Lundberg (2005, 43),
included sites in northern Sweden, Finland, Skåne and Gotland. This cluster agrees
well with the sites with high PC2 values in the lat vowel in the present analysis.

According to Lundberg (2005, 43), the two other clusters identified distinguished 
a more rounded pronunciation from the Standard Swedish one. This division could 
not be identified in the present analysis. 

The vowel in lat was a short vowel in Proto-Nordic and was lengthened during
the Swedish quantity shift. Of the sites included in the present study, at least Vörå
(Österbotten) has preserved a short vowel phoneme in lat.


6.1.16 lass /a/ 


Figure C.16 (p. 229) displays the maps of the lass vowel. The total amount of
variation is very small, which can also be seen in Figure 6.1 (p. 87). The PC2 values
vary more than the PC1 values. The a in lass (Standard Swedish [a]) has higher
PC1 and PC2 values than the a in lat (Standard Swedish [ɑ:]).

In the South Swedish area, the PC1 values are lower than the average, while
the PC2 values are relatively high, resulting in grayish colors. Bruce (2010, 139)
mentions that the short a has a fronted pronunciation in southern Skåne, close to [a],
which fits with the PC scores. Scores similar to those of the South Swedish varieties are
also found for some sites in Norrland (for example, Delsbo, Nederluleå, Överkalix),
especially for the older speakers.


6.1.17 lås/låt /o:/


The vowel elicited with the words lås and låt, displayed in Figure C.17 (p. 230), is a
close-mid back vowel in Standard Swedish, [o:]. Also in most dialects the vowel is a
back vowel, with colors between black and yellow in the maps (the extremely dark
color of the Standard Swedish reference point is a result of the higher signal-to-noise
ratios in the recordings of the standard speakers, compare § 5.1.6).


⁵ The data of the study comprised older male speakers from the SweDia database.


The most striking feature of the maps is the bluish color close to vowel onset for South
Swedish varieties. These varieties have extremely high PC2 values close to onset.
The high PC2 values are most extreme in the province of Skåne. Close to the
offset these dialects have much lower PC2 values, which indicates a diphthongized
pronunciation. The PC1 values found here are also high compared to other dialects,
which suggests a relatively open pronunciation. The PC1 values do not differ much
between onset and offset. The South Swedish pronunciation of Standard Swedish
/o:/ is, according to Elert (2000, 38), [fo]. The PC values fit this pronunciation well.

Other varieties with a diphthongized pronunciation are found on Gotland, where 
both the PC1 and the PC2 values are higher close to offset than close to onset. Elert 
(2000, 42) gives the pronunciation [o:*] for Standard Swedish /o:/ on Gotland, which 
agrees with the PC values. 

The lowest PC1 values and the lowest PC2 values close to onset for the lås/låt
vowel are found in Svealand. Both PC values are somewhat higher close to offset
than close to onset.

In Finland, the PC2 values are generally also low, but the PC1 values are higher 
than in Svealand. 

In Norrbotten and a few other places in Norrland (for example Arjeplog, Berg), 
high PC1 values are found close to onset especially among the older speakers. Close 
to offset the PC1 values are lower. 


6.1.18 lott /ɔ/


The vowel in lott (Standard Swedish [ɔ]) shows very little variation across sites and
generations on PC1 and considerably more variation on PC2 (see Figure 6.1, p. 87).
The maps are displayed in Figure C.18 (p. 231). The PC1 and PC2 values are higher
than in the corresponding long vowel (the vowel in lås and låt).

Particularly high PC2 values are found for dialects in Härjedalen and Dalarna
(bluish color on the maps). Low PC2 values are found among Finland-Swedish
varieties, in Uppland and a few other scattered varieties. Many varieties in Götaland
have higher PC2 values close to onset than close to offset. The opposite holds for
many of the sites in Norrland, in southern Finland and on Gotland.


6.1.19 sot /u:/ 


The vowel in the word sot is a close back vowel in Standard Swedish, [u:], and
has dark colors for most varieties in the maps in Figure C.19 (p. 232). As for the
other long back vowel (the vowel in lås and låt, § 6.1.17), the South Swedish area
is distinguished from other dialects by having a diphthongized pronunciation with a
blue color close to onset. For the sot vowel the blue southern area reaches further
to the north than for the lås/låt vowel.

Other areas where strong diphthongization is found for the sot vowel are
Gotland, Norrbotten (older speakers), Österbotten and, to a somewhat lesser degree,
Östergötland and neighboring sites in Småland. In these areas both PC values are
relatively high close to the onset, but decrease towards the offset of the vowel,
indicating a closing diphthong.

In many places in Norrland the PC2 values are somewhat higher close to offset 
than close to onset resulting in a bluish color. 


6.2. Vowel comparison 


The maps discussed in § 6.1 show that some of the vowels analyzed are relatively 
stable geographically and across generations. Other vowels show large variation 
between sites or seem to be changing. This section summarizes the data of the 19 
different vowels and compares the amount of geographic dispersion (§ 6.2.1) and the 
amount of change (§ 6.2.2) across the vowels. 

For measuring the amount of geographic variation the acoustic distances between 
all pairs of sites were calculated for each vowel. In order to measure the amount of 
change per vowel the acoustic distances between older and younger speakers at each 
site were calculated. These distances were measured as the Euclidean distance of 
the two PCs measured at nine different points within each vowel segment starting 
at 25% of the total vowel duration and ending at 75%. 

Equation 6.1 shows the Euclidean distance, where i ranges over the nine sampling
points per vowel and x and y are either two different sites (§ 6.2.1) or older and
younger speakers at one site (§ 6.2.2):

\[
\mathrm{distance}(x, y) = \sqrt{\sum_{i=1}^{9} \left( (\mathrm{PC1}_{xi} - \mathrm{PC1}_{yi})^2 + (\mathrm{PC2}_{xi} - \mathrm{PC2}_{yi})^2 \right)} \tag{6.1}
\]
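
As a rough illustration of Equation 6.1, the distance between two vowel trajectories could be computed as in the sketch below. The function name and the representation of a trajectory as a list of (PC1, PC2) pairs, one per sampling point, are assumptions made for this example only; the numeric values are invented.

```python
import math

def trajectory_distance(x, y):
    """Euclidean distance between two vowel trajectories (cf. Equation 6.1).

    x and y are sequences of (PC1, PC2) pairs, one pair per sampling point.
    """
    if len(x) != len(y):
        raise ValueError("trajectories must have the same number of sampling points")
    squared = sum((x1 - y1) ** 2 + (x2 - y2) ** 2
                  for (x1, x2), (y1, y2) in zip(x, y))
    return math.sqrt(squared)

# Hypothetical values: two flat trajectories that differ by 0.1 on PC1
# at each of the nine sampling points.
site_a = [(0.5, 0.2)] * 9
site_b = [(0.6, 0.2)] * 9
print(trajectory_distance(site_a, site_b))
```

Because the sum runs over all nine sampling points, a constant offset between two trajectories contributes at every point, so the distance reflects both vowel quality and its movement through the segment.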


6.2.1 Geographic variation 


For measuring the degree of geographic variability, the pair-wise Euclidean distances 
(Equation 6.1) between all sites in the data set were calculated for each vowel. This 
was done separately for the older and the younger speakers. The average values of the 
vowel pronunciation of each speaker group at each site were used when calculating 
the Euclidean distances. Only sites where all vowels were recorded were included to 
make sure that the comparison of the vowels would not be biased by missing data. 
In the older speaker group all vowels were recorded at 89 sites, while the number
of sites with all vowels in the younger speaker group was 91, which led to 3,916
pair-wise distances between sites for older speakers and 4,095 distances for younger
speakers.⁶
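
The pair counts and the per-vowel median used in this section can be sketched as follows. This is a minimal illustration, not the procedure actually used: the helper names and the toy site data are hypothetical, and each trajectory is flattened into one vector so that the standard-library `math.dist` gives the same result as Equation 6.1.

```python
import math
from itertools import combinations
from statistics import median

def n_pairs(n):
    """Number of unordered pairs among n sites: n(n - 1)/2."""
    return n * (n - 1) // 2

def median_site_distance(sites):
    """Median pair-wise distance for one vowel.

    sites maps a site name to a list of (PC1, PC2) pairs, one per sampling point.
    """
    flat = {name: [v for point in traj for v in point]
            for name, traj in sites.items()}
    distances = [math.dist(flat[a], flat[b]) for a, b in combinations(flat, 2)]
    return median(distances)

print(n_pairs(89), n_pairs(91))  # 3916 4095

# Invented toy data for three sites (not values from the data set).
toy_sites = {
    "A": [(0.0, 0.0)] * 9,
    "B": [(1.0, 0.0)] * 9,
    "C": [(0.0, 2.0)] * 9,
}
print(median_site_distance(toy_sites))
```

The median is taken over all site pairs rather than over sites, which is why a handful of sites with extreme pronunciations has limited influence on the summary value.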

Table 6.1 displays the median of the pair-wise distances per vowel in each age 
group. The vowels are listed according to descending distance for each age group 
separately. The median was chosen instead of the mean as a measure of central 
tendency, because for some words there are extreme outliers in the data set. For 
example, the few dialects that have preserved old diphthongs in words like leta, lös


⁶ The number of pair-wise distances between n items is (n × (n − 1))/2.


Table 6.1. Median acoustic distances between sites per vowel, for older and younger
speakers. The vowels are listed in order of descending distance for each age group
separately.


older speakers               younger speakers
vowel              median    vowel              median
dör       [œ:]     2.23      dör       [œ:]     1.88
sot       /u:/     2.01      sot       /u:/     1.73
lus       /ʉ:/     1.82      lös       /ø:/     1.72
flytta    /ʏ/      1.78      söt       /ø:/     1.67
lös       /ø:/     1.73      dis       /i:/     1.57
lär       [æ:]     1.70      flytta    /ʏ/      1.56
leta      /e:/     1.69      lus       /ʉ:/     1.52
lås/låt   /o:/     1.69      typ       /y:/     1.46
nät       /ɛ:/     1.61      nät       /ɛ:/     1.45
dörr      [œ]      1.57      lås/låt   /o:/     1.42
lat       /ɑ:/     1.56      leta      /e:/     1.41
dis       /i:/     1.53      lett      /ɛ/      1.38
lott      /ɔ/      1.52      dörr      [œ]      1.33
typ       /y:/     1.49      lott      /ɔ/      1.27
lett      /ɛ/      1.47      lär       [æ:]     1.26
söt       /ø:/     1.42      lat       /ɑ:/     1.12
disk      /ɪ/      1.30      disk      /ɪ/      1.11
lass      /a/      1.21      lass      /a/      1.01
särk      [æ]      1.19      särk      [æ]      0.99
mean               1.61      mean               1.41


and dör have very large distances to the other sites on these vowels (see, for example,
sites with yellow color close to onset in the maps of leta, Figure C.5, p. 218).

Almost all distances in Table 6.1 are larger for the older speakers than for the 
younger (but compare söt in both lists). The mean is 1.61 for the older speakers and
1.41 for the younger speakers. This difference is significant (Paired Samples t-test, 
t(18) = 5.06, p < 0.001), which means that there is less geographic variation in the 
pronunciation of the vowels among younger speakers than among older speakers. 
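
The paired comparison can be approximated from the medians in Table 6.1. The sketch below assumes that the test pairs the older and younger median of each of the 19 vowels, which is what the 18 degrees of freedom suggest; the variable names are illustrative. Because the tabulated medians are rounded to two decimals, the statistic comes out at roughly 5 and should not be expected to reproduce the reported value exactly.

```python
import math
from statistics import mean, stdev

# (older, younger) median distances per vowel, copied from Table 6.1.
medians = {
    "dör": (2.23, 1.88), "sot": (2.01, 1.73), "lus": (1.82, 1.52),
    "flytta": (1.78, 1.56), "lös": (1.73, 1.72), "lär": (1.70, 1.26),
    "leta": (1.69, 1.41), "lås/låt": (1.69, 1.42), "nät": (1.61, 1.45),
    "dörr": (1.57, 1.33), "lat": (1.56, 1.12), "dis": (1.53, 1.57),
    "lott": (1.52, 1.27), "typ": (1.49, 1.46), "lett": (1.47, 1.38),
    "söt": (1.42, 1.67), "disk": (1.30, 1.11), "lass": (1.21, 1.01),
    "särk": (1.19, 0.99),
}

# Paired t-test: mean of the per-vowel differences divided by its standard error.
differences = [older - younger for older, younger in medians.values()]
n = len(differences)
standard_error = stdev(differences) / math.sqrt(n)
t_statistic = mean(differences) / standard_error
degrees_of_freedom = n - 1

print(f"t({degrees_of_freedom}) = {t_statistic:.2f}")
```

Only three of the 19 differences are negative (söt, dis and, marginally, none beyond those two apart from rounding), which is why the mean difference is large relative to its standard error.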

The two vowels found at the top of the table for both age groups, and, hence,
the ones varying the most geographically in both generations of speakers, are the
vowels in dör and sot. The median distance between sites for the vowel in dör is
2.23 for the older speakers and 1.88 for the younger speakers. The median distances
of the sot vowel are 2.01 for older speakers and 1.73 for younger speakers. In both
cases, the median distance between sites is shorter for younger speakers than older
speakers. Even if the amount of variation in these vowels seems to be decreasing,
they still remain the most variable of the vowels in the data set.

Also the three least variable vowels are the same for older and younger speakers:
the vowels in disk, lass and särk. These vowels have median distances close to 1 for
the younger speakers and between 1.19 and 1.30 for older speakers.

For both older and younger speakers long vowels vary more than short vowels.
This is expected based on the previous knowledge of Swedish vowels described in 
§ 2.3. Regional varieties of Standard Swedish vary more in their pronunciation of 
long vowels than of short vowels (see § 2.3.2) and in the rural dialects the long vowel 
systems are more variable than the short vowel systems (see § 2.3.3). 

The vowels for which the distances between sites have decreased the most from
the older to the younger generation are the long vowels in lär and lat. The median
distance among the older speakers is 1.70 for lär and 1.56 for lat. In the younger age
group the median distances are 1.26 (lär) and 1.12 (lat). The role of these vowels
as markers of dialectal identity seems to be decreasing the most.

The vowel in the word söt has a considerably larger median distance in the
younger generation than in the older. In the older generation this vowel shows
relatively little geographic variation. The median distance for the older speakers is
1.42, and the vowel is in fourth place from the bottom of the descending list. In the
younger generation, the vowel in söt has a median distance of 1.67 and is found in
fourth place from the top of the list, next to lös, which has the same vowel phoneme
in Standard Swedish. In the younger generation, three out of the four most varying
vowels correspond to the Standard Swedish phoneme /ø:/. This seems to be the
most prominent regional marker in vowel pronunciation for younger Swedes.


6.2.2 Degree of change 


In the previous section linguistic distances between sites were calculated in order to 
measure the geographic variability of each of the vowels. For measuring the degree of 
change of each vowel, linguistic distances between the two age groups were calculated. 
For each site, the distance between the older and younger speakers was computed 
for each vowel using Euclidean distance (Equation 6.1, p. 102). As in the previous 
section the median was considered the most appropriate measure of central tendency 
because of the skewed distribution for some of the vowels. Table 6.2 displays the 
median acoustic distance between the two age groups for each vowel in descending 
order. The table shows which vowels have changed the most on average. 

The vowels that seem to be changing the most are the Swedish long ä and ö
vowels. The four vowels that have the highest median difference between older and
younger speakers all correspond to ä or ö: the vowels in lär, nät, lös and dör, with
median distances between 1.60 and 1.85. In sixth place in the list söt is found,
with a median difference of 1.52. The short ä and ö vowels in särk and dörr show
less change. The median age difference of the vowel in särk is 1.02 and of the vowel
in dörr 0.97.

Vowels that seem relatively stable, with little difference between the two age 
groups, are the long and short a in lat and lass and the short vowels in lott and 
disk. The previous section showed that short vowels vary less geographically than 
long vowels. The same holds for the variation between the two generations: the long 
vowels show more change than the short vowels. 


Table 6.2. Median acoustic distance between older and younger speakers for
each vowel. The vowels are listed in descending order.


vowel              median
lär       [æ:]     1.85
nät       /ɛ:/     1.77
lös       /ø:/     1.72
dör       [œ:]     1.60
lett      /ɛ/      1.57
söt       /ø:/     1.52
leta      /e:/     1.50
lås/låt   /o:/     1.27
dis       /i:/     1.27
lus       /ʉ:/     1.21
flytta    /ʏ/      1.19
typ       /y:/     1.18
sot       /u:/     1.14
disk      /ɪ/      1.13
särk      [æ]      1.02
lat       /ɑ:/     1.00
dörr      [œ]      0.97
lott      /ɔ/      0.95
lass      /a/      0.87

Comparing the amount of change between the two age groups per vowel with the
amount of geographic variation per vowel in the previous section (Table 6.1) gives
some insight into the different types of change in the different vowels. The vowel
in lär, which is the vowel that has the largest degree of change (median age
difference 1.85), has a moderate amount of geographic variation in the older generation
(median distance between sites 1.70), but in the younger generation it is one of the
least varying vowels (median distance between sites 1.26). In the case of the lär
vowel there is a clear effect of dialect leveling, as the distances between varieties are
decreasing.

The ö in dör shows a different pattern. This vowel is the geographically most
variable vowel in both age groups (median distance between sites 2.23 for older and
1.88 for younger speakers). Still, the vowel in dör is also one of the vowels that
has changed the most (median age difference 1.60). This vowel seems to hold its
position as an important dialect marker even though its pronunciation is changing.
Also the vowel in nät shows a quite similar amount of geographic variation in both
age groups (old 1.61; young 1.45), even though it is changing substantially (median
age difference 1.77).

The vowel in sot is the second most variable vowel across sites. This is true for
both generations according to Table 6.1. The sot vowel does not show much change
between the generations (median age difference 1.14), but seems to be a rather stable
dialectal marker.

The vowel in lett only varies to a moderate degree geographically in both
generations (old 1.47, young 1.38), but is one of the vowels that has changed the most
in addition to the ä and ö vowels (median age difference 1.57). This suggests that
the vowel is changing in a rather similar way across large parts of the language area.
This is confirmed in § 6.1.6, which showed that the pronunciation of the lett vowel
is more open among younger speakers in general than among older speakers.


6.3. Co-occurring vowel features 


According to the data in § 6.1, some vowel features seem to have very similar
geographic and/or generational distributions. For example, the vowels in dör and lär
show similar patterns of change between the two generations (younger speakers have
a more open pronunciation than older speakers), while the vowels in sot and typ are
quite stable across the generations and show similar geographic distribution (South
Swedish diphthongization). In order to quantify and measure the strength of
co-variation between vowel features and to identify the main distribution patterns in
the data, a factor analysis was carried out.


6.3.1 Factor analysis 


Factor analysis (FA) is closely related to principal component analysis (PCA,
described in § 5.1.2). Like PCA, FA reduces a large data set into a smaller
number of loadings and scores, which enable the researcher to identify whether the
variables can be divided into relatively independent subsets (components/factors)
and which of the variables in a data set show similar patterns of variation.
Normally, a researcher chooses to analyze a data set either by means of PCA or by FA.
In the present thesis, however, both methods are used. In Chapter 5 PCA was used
to reduce a filter bank representation of speech samples to two articulatorily
meaningful components. In this section FA is used for analyzing geographic and social
co-variation in the PCs of the 19 vowels.

The main difference between FA and PCA is that FA analyzes co-variance, while
PCA analyzes variance. This means that only variance that two or more variables
share is analyzed in FA, while PCA analyzes all variance present in the data set.
This makes FA a more suitable method for identifying co-occurring linguistic
features. In a comparison of different component methods Leino & Hyvönen (2008)
concluded that FA is the most stable component method for identifying dialect
regions, providing easily interpretable results.

Table 6.3 shows a sample of the data used as input for the FA. As in all other 
analyses in the present chapter, the data was divided into older and younger speakers 
per site, and average values of the acoustic variables were computed for the two 
speaker groups per site. FA is more stable without missing values in the analysis than 
with missing data. Hence, only speaker groups where all 19 vowels were recorded 
were used in the analysis. For the older speaker group the number of sites where all 
vowels were recorded was 89 and for the younger group 91. This gave a total of 180 
objects (data rows) in the analysis. 
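
The selection of complete objects described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the site names, the values and the three-word list are made up, and the real objects carry 19 words with four values each.

```python
# Hypothetical toy data: (site, age group) -> {word: averaged acoustic value}.
# Three words keep the sketch short; the real analysis requires all 19.
REQUIRED_WORDS = {"dis", "disk", "typ"}

groups = {
    ("SiteA", "old"): {"dis": -0.9, "disk": -1.1, "typ": -0.7},
    ("SiteA", "young"): {"dis": -0.7, "disk": -1.0},  # "typ" was not recorded
    ("SiteB", "old"): {"dis": -1.2, "disk": -1.2, "typ": -1.1},
}


def complete_objects(groups, required):
    """Keep only speaker groups that have a value for every required word."""
    return {
        key: values
        for key, values in groups.items()
        if required.issubset(values)
    }


kept = complete_objects(groups, REQUIRED_WORDS)
print(sorted(kept))
```

Dropping incomplete objects is simple but discards data; it is only reasonable when, as here, most objects are complete.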


Table 6.3. Sample of data for the FA. The objects of the analysis are the two age groups at each site. The objects (180 in total) are 
represented by average values on the variables of around six speakers per group (three men and three women). The analysis comprises 
76 variables: 19 words × 4 values (PC1 and PC2 onset, and PC1 and PC2 diphth.). 


objects             variables
                        dis                         disk                       ...    typ
                             onset       diphth.         onset       diphth.               onset       diphth.
site       speakers     PC1    PC2    PC1    PC2    PC1    PC2    PC1    PC2   ...    PC1    PC2    PC1    PC2
Ankarsrum  old        -0.97   0.23   1.12   0.55  -1.16  -0.07   1.15   0.29   ...  -0.78   0.37   0.90   0.08
           young      -0.74   0.38   0.87   0.24  -1.00   0.15   1.03   0.25   ...  -0.63   0.44   1.17   0.20
Anundsjö   old        -1.27  -0.08   1.16   0.31  -1.22  -0.15   0.98   0.05   ...  -1.20   0.17   1.53   0.06
           young      -1.21   0.05   1.18   0.20  -0.99  -0.12   0.95   0.16   ...  -0.96  -0.09   1.59   0.18
Arjeplog   old        -1.09   0.33   1.18   0.59  -0.94   0.38   1.15   0.26   ...  -0.72   0.61   1.17   0.27
           young      -1.08   0.11   0.98   0.48  -1.07   0.07   1.14   0.24   ...  -0.97   0.30   1.12   0.20
Öxabäck    old        -1.45  -0.04   0.76  -0.08  -1.21  -0.58   0.84  -0.08   ...  -1.30  -0.17   1.07   0.16
           young      -1.15   0.04   0.73   0.12  -0.83  -0.05   1.18   0.34   ...  -0.98   0.08   1.20   0.17

Figure 6.2. Scree plot of the FA. The panel to the left shows the initial eigenvalues 
of all 76 factors. The panel to the right shows the amount of variance explained by 
the ten factors extracted with varimax rotation. Varimax equalizes the proportion 
of variance explained by the extracted factors (compare § 5.1.4). 


The variables used in the FA were vowel quality and the degree of diphthongiz- 
ation of each of the 19 vowels. The vowel quality was measured at 25% of the total 
vowel duration. The amount of diphthongization was calculated as the difference 
between the vowel quality close to onset (at 25%) and close to offset (at 75% of the 
total vowel duration). Vowel quality was measured with two acoustic PCs, which 
led to a total of 76 variables in the analysis: four acoustic variables (vowel quality 
at onset and degree of diphthongization, both measured with two PCs) for each of 19 vowels. 

The abstraction of the amount of diphthongization (PC value close to onset 
minus PC value close to offset) makes it possible to detect vowels that show similar 
types of diphthongization even if the vowel quality is different. A negative value of 
diphthongization means that the PC value increases during the vowel segment, and 
the other way around. In articulatory terms, this means that a negative value of 
diphthongization on PC1 suggests an opening diphthong, while a positive value of 
diphthongization on PC1 suggests a closing diphthong. For PC2, the relationship 
with articulation is not as direct (see § 5.2), but in rough terms, a negative value 
indicates that the pronunciation becomes fronted during the vowel segment, while 
a positive value indicates that the vowel changes from a fronted vowel towards the 
back. 
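
The sign convention can be illustrated with a short sketch. The function names, the example values and the tolerance for calling a vowel monophthong-like are assumptions made for this illustration; they are not taken from the analysis itself.

```python
def diphthongization(pc_onset, pc_offset):
    """PC value close to onset (25%) minus PC value close to offset (75%)."""
    return pc_onset - pc_offset


def describe_pc1(value, tolerance=0.1):
    """Rough articulatory reading of a PC1 diphthongization value.

    The tolerance is an arbitrary illustration value, not a threshold
    used in the thesis.
    """
    if abs(value) <= tolerance:
        return "monophthong-like"
    return "closing diphthong" if value > 0 else "opening diphthong"


# Made-up PC1 values, not taken from the data set.
rising = diphthongization(pc_onset=-0.5, pc_offset=0.5)   # PC1 increases
falling = diphthongization(pc_onset=0.5, pc_offset=-0.5)  # PC1 decreases

print(rising, describe_pc1(rising))
print(falling, describe_pc1(falling))
```

Because the measure is a difference, two vowels with very different onset qualities receive the same value when their PC trajectories change by the same amount.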

Exactly like PCA, FA can be carried out using a variance-covariance matrix or a 
correlation matrix. Using a correlation matrix means that all variables have been 
standardized, that is, transformed into z-scores before the analysis. Standardizing 
the variables makes sense if variables are measured on different scales or have different 
ranges. In the present data set the measures of vowel quality at onset and the 
degree of diphthongization are of different magnitudes. The variance of the degree 
of diphthongization is generally smaller than that of the vowel quality. Hence, the 
FA was carried out on a correlation matrix. With this method, the variance that 
each variable contributes is 1 and only factors with an eigenvalue greater than 1 
are considered important, since they contribute more to the analysis than any 
single variable does. The analysis gave 18 factors with an eigenvalue greater than 1. 
However, according to Tabachnick & Fidell (2007), using all factors with an eigenvalue 
greater than 1 is likely to overestimate the number of factors to be extracted if the 
analysis comprises more than 40 variables. Since the current analysis comprises 76 
variables, the scree plot was used as an additional method to determine the number 
of factors. The scree plot (Figure 6.2) does not show any sharp elbow, but after the 
tenth factor the slope does not change direction radically. Based on the scree plot 
ten factors were extracted. The extracted factors were rotated with varimax (see § 5.1.4), 
and together the ten factors explain 60.6% of the total variance in the data. 
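
The eigenvalue criterion and the quantities behind a scree plot can be sketched with NumPy. This is an illustration on synthetic data, not a reproduction of the analysis: the data matrix, its size (12 variables instead of 76) and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the real data: 180 objects x 12 variables,
# built from 3 hidden factors plus noise so that some structure exists.
n_objects, n_variables, n_hidden = 180, 12, 3
hidden = rng.normal(size=(n_objects, n_hidden))
weights = rng.normal(size=(n_hidden, n_variables))
data = hidden @ weights + rng.normal(size=(n_objects, n_variables))

# A correlation matrix is the covariance matrix of the z-scored variables,
# so every variable contributes a variance of exactly 1.
corr = np.corrcoef(data, rowvar=False)

# Eigenvalues in descending order; these are the values a scree plot shows.
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Eigenvalue-greater-than-1 rule.
n_above_one = int((eigenvalues > 1.0).sum())

# Cumulative share of the total variance captured by the first k eigenvalues.
explained = np.cumsum(eigenvalues) / n_variables

print("eigenvalues:", np.round(eigenvalues, 2))
print("factors with eigenvalue > 1:", n_above_one)
print("cumulative variance explained:", np.round(explained, 3))
```

The eigenvalues of a correlation matrix sum to the number of variables, which is why an eigenvalue of 1 corresponds to the contribution of a single variable. An orthogonal rotation such as varimax redistributes variance among the retained factors but leaves their combined share unchanged.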

Table 6.4 shows a sample of the data after reduction by means of FA. Each object 
is now described by scores on the ten extracted factors instead of by values on the 
original 76 variables (sample in Table 6.3). In addition to the factor scores, the 
analysis results in a set of loadings on each factor. The loadings are displayed in 
Tables 6.5-6.12, and show the degree of correlation between the original variables and 
each factor. Following Tabachnick & Fidell (2007), loadings above 0.71 are considered 
excellent, above 0.63 very good and above 0.55 good. Loadings below 0.55 were 
not analyzed. Because the objects of the analysis are geographic locations and the 
two speaker groups, each factor clusters vowel features with similar geographic and 
generational distributions. 
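
The cut-offs for loadings can be written down as a small helper. The function name is hypothetical, and applying the cut-offs to the absolute value of a loading is an assumption made here, chosen because negative loadings are also reported (see Table 6.8).

```python
def rate_loading(loading):
    """Label a loading by its absolute size, using the cut-offs given above."""
    size = abs(loading)  # the sign only gives the direction of the correlation
    if size > 0.71:
        return "excellent"
    if size > 0.63:
        return "very good"
    if size > 0.55:
        return "good"
    return "not analyzed"


for value in (0.82, 0.65, -0.57, 0.50):
    # For orthogonal factors, loading**2 is the share of the variable's
    # variance that overlaps with the factor.
    print(f"{value:+.2f}  {rate_loading(value):<12}  shared variance ~ {value ** 2:.2f}")
```

Read this way, the 0.71 cut-off corresponds to roughly half of a variable's variance being shared with the factor, and 0.55 to roughly 30%.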

Below, the factor scores of each extracted factor are visualized on maps. For 
each factor there are two maps: one for the older speakers and one for the younger 
speakers. On the maps the area of each variety is colored on a scale from green to 
magenta (see Appendix B, § B.3). Green means that the score is low, while magenta 
indicates a high score. Hence, objects with similar scores have similar colors. Similar 
scores indicate similar pronunciations of the vowels with high loadings on the factor 
in question. 

In contrast to the maps discussed in § 6.1, these maps cannot be interpreted 
directly in terms of vowel quality. The maps visualizing the factor scores display 
solely distribution patterns found in a number of variables. The factor loadings tell 
which of the vowels are connected to these distribution patterns. Differences in vowel 
quality among the objects can to some extent be inferred by interpreting the scores 
and loadings together. Since factor loadings can be interpreted as correlations, a 
high positive loading indicates that objects with high scores have higher values on 
the variable in question than objects with low scores, while high negative loadings 
suggest the opposite. 
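
How scores and loadings combine can be shown with a small numeric sketch. It assumes an orthogonal factor solution and ignores the unique part of each variable; the numbers and the function name are made up for illustration.

```python
def predicted_z(scores, loadings):
    """Approximate standardized value of one variable for one object.

    In an orthogonal factor solution the standardized variable is modeled
    as the loading-weighted sum of the object's factor scores; the unique
    part of the variable is left out of this sketch.
    """
    return sum(score * loading for score, loading in zip(scores, loadings))


# Made-up numbers for a single factor, for illustration only.
high_score, low_score = [1.5], [-1.5]
positive_loading, negative_loading = [0.8], [-0.8]

print(predicted_z(high_score, positive_loading))  # above the variable's mean
print(predicted_z(low_score, positive_loading))   # below the variable's mean
print(predicted_z(high_score, negative_loading))  # below the variable's mean
```

The same map color can therefore mean a high value on one variable and a low value on another, depending on the sign of the loading.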


Table 6.4. Result of data reduction by FA. The original 76 variables (that is, four acoustic variables of 19 vowels) have been reduced 
to scores on ten factors. 


site       speakers   factor1  factor2  factor3  factor4  factor5  factor6  factor7  factor8  factor9 factor10
Ankarsrum  old           0.44     0.70    -0.65    -0.22     0.65     1.14    -1.94     0.34     0.33     0.62
           young         0.11     1.34     0.72    -0.96     1.02     0.74    -1.44    -0.86     1.39     0.22
Anundsjö   old           0.77    -0.50    -1.12     0.42     0.40    -0.17     1.08     0.35     1.16    -0.54
           young         1.16    -0.53    -0.26     0.07    -0.16     0.34     1.00     0.26    -0.27    -0.01
Arjeplog   old           0.52     0.75    -0.46    -0.52     1.72     1.30     2.07     0.75     0.30    -0.77
           young         0.61    -0.15     0.37    -0.43     0.29     0.83     2.06    -0.39     0.20    -0.78
Öxabäck    old          -0.30    -0.81    -1.37     1.00    -1.24     0.39    -0.35     0.64    -1.29    -0.02
           young         0.52    -0.42    -0.46     0.08    -1.56     1.46    -0.50     1.62    -0.25    -0.10


Table 6.5. High loadings on the first factor (13.5% variance explained). 

vowel            measure  pc    loading
dis      /i:/    onset    pc2    0.817
flytta   /y/     onset    pc2    0.813
disk     /ɪ/     onset    pc2    0.812
nät      /ɛ:/    onset    pc2    0.801
lett     /e/     onset    pc2    0.794
lös      /ø:/    onset    pc2    0.782
typ      /y:/    onset    pc2    0.765
lär      [æ:]    onset    pc2    0.756
lus      /ʉ:/    onset    pc2    0.747
söt      /ø:/    onset    pc2    0.727
särk     [æ]     onset    pc2    0.725
leta     /e:/    onset    pc2    0.723
lass     /a/     onset    pc2    0.681
dörr     [œ]     onset    pc2    0.635
lat      /a:/    onset    pc2    0.602
dör      [œ:]    onset    pc2    0.561


6.3.2 Factor 1 


The first factor explains 13.5% of the total variance in the data. A number of front 
vowels show high positive loadings (Table 6.5). For all of these vowels, the high 
loadings concern the onset value of PC2. Figure 6.3 shows the scores of this factor. 
The more magenta the coloring on the map, the higher the score. 

The positive loadings indicate that objects with high scores have higher PC2 
values on the vowels involved than objects with low scores. The maps show that 
many Svealand varieties and varieties along the Finnish south coast have low scores 
on this factor (green), while most of the dialects in the rest of the language area 
have higher scores (magenta). The scores show a relatively stable pattern across 
the two generations, which indicates that the variables are not involved in any large 
ongoing change. Lower PC2 values in Svealand and on the Finnish south coast were 
also noted for most of the front vowels in § 6.1. 

As shown in Figure 5.13 (p. 77), the correlation between PC2 and F2 is weaker 
precisely for the highest F2 values, that is, for front vowels. The low PC2 values for 
front vowels in Svealand and in the south of Finland are therefore not necessarily 
an effect of retracted pronunciation and lower F2. It is in fact quite unlikely that 
all front vowels would show a non-maximal F2. That would mean that the whole 
vowel space would not be used and that there would not be a maximal acoustic 
distinction between front and back vowels. This would be against the theory of 
maximal dispersion (Lindblom, 1986). 

It is more likely that the low PC2 values in the green area in Figure 6.3 are a result 
of lower intensity at the highest frequency area in general than of low frequencies 
of F2. A possible explanation for differences in the intensity at higher frequencies 
would be differences in voice quality. Less vocal effort and the use of breathy voice 
Figure 6.3. Scores of the first factor. Green = low score, magenta = high score. 
Vowels with high loadings are displayed in Table 6.5. The gradient colors most likely 
denote differences in spectral slope, which might suggest voice quality differences. 


are factors that increase the spectral tilt. When breathy voice is used the vocal folds 
do not close simultaneously; they close first at the front and never close completely 
at the back. Hence higher harmonics in the spectrum are attenuated (Klatt & 
Klatt, 1990, 822-823). The result is a steep spectral slope and little energy in the 
highest part of the spectrum. Varying the degree of physical effort used for speech 
production has a similar effect on the spectral tilt (Sluijter & Van Heuven, 1996). 
In loud speech the glottal pulse is asymmetrical (the closing phase is faster) which 
increases the intensity of higher harmonics. In softer speech there is less intensity 
at higher frequencies and the spectral tilt is steeper. 

Elert (1983) discusses regional variation in voice quality in Swedish. Based on his 
own perceptual observations, Elert mentions that creaky voice is used in Småland while 
breathy voice can be found in the inland of Norrland. The fundamental frequency 
is lowest in the north, increases towards the south and reaches a maximum in 
Västergötland and Östergötland. The area surrounding Stockholm is, according to 
Elert, characterized by nasalization. Nasalization of vowels leads to anti-formants in 
the spectrum as a result of acoustic coupling (Rietveld & Van Heuven, 2009). Some 
frequencies are reinforced while others are attenuated by nasalization, and the effect 
on the spectrum varies across vowels. 
The conclusion that the first factor of this study is connected to voice quality 
differences is only speculative and has not been confirmed by other analyses of the 
data. Regional differences in voice quality should be studied further with instrumental 
methods. Possible explanations for the low PC2 values in Svealand and 
Finland are more use of breathy voice or the use of less vocal effort (that is, softer 
speech) than in Götaland and Norrland, or nasal voice quality. 


6.3.3 Factor 2 


The second factor correlates with the long close vowels in typ, dis, sot and lus 
(Table 6.6). This concerns the onset value and degree of diphthongization of PC1 of 
the vowels in typ, dis and lus and PC2 of the sot vowel. The factor explains 9.1% 
of the total variance. 

The maps in Figure 6.4 show that the South Swedish varieties clearly differ from 
the rest on this factor. These varieties have a clear magenta color, which means high 
scores and indicates higher PC1 values at the onset in the vowels in typ, dis and lus 
and higher PC2 values in the vowel in sot than in the rest of the language area. This 
suggests a more open pronunciation at onset in typ, dis and lus and a more fronted 
pronunciation at onset in sot. 

The loadings related to the degree of diphthongization of these vowels are also 
positive, which suggests higher diphthongization values for the South Swedish vari- 
eties than for the other varieties. As explained in § 6.3.1, positive diphthongization 
values indicate decreasing PC scores during the vowel segment, while negative diph- 
thongization values indicate increasing PC scores. A diphthongization value close to 
zero means a monophthong-like pronunciation. Because the diphthongization values 
can be both positive and negative, the range of the diphthongization values of the 
objects has to be taken into account when interpreting factor scores connected to 
diphthongization. The box plots in Figure 6.5 show the dispersion of the variables 
measuring the amount of diphthongization with high loadings on the second factor. 
The central box spans values around or slightly above zero for all four variables, 
which means that most varieties have a monophthong-like pronunciation. All of the 
four vowels show a number of outliers with high positive values close to 1, and for 
the sot vowel even higher. Since the South Swedish varieties have high positive 


Table 6.6. High loadings on the second factor (9.1% variance explained). 

vowel            measure  pc    loading
typ      /y:/    diphth   pc1    0.849
typ      /y:/    onset    pc1    0.843
dis      /i:/    diphth   pc1    0.782
sot      /u:/    onset    pc2    0.767
dis      /i:/    onset    pc1    0.755
sot      /u:/    diphth   pc2    0.706
lus      /ʉ:/    diphth   pc1    0.683
lus      /ʉ:/    onset    pc1    0.583
Figure 6.4. Scores of the second factor, which is an indicator of South Swedish 
diphthongization. Green = low score, magenta = high score. Vowels with high 
loadings are displayed in Table 6.6. 


scores on the second factor, the outliers are likely to be the South Swedish variet- 
ies. Positive values of diphthongization on PC1 indicate closing diphthongs, while 
positive values of diphthongization of PC2 indicate that the vowel changes from a 
more fronted vowel towards the back. The conclusion can be drawn that the vowels 
in typ, dis and lus are all closing diphthongs in South Swedish, while the vowel in 
sot moves from a fronted position to the back. This corresponds well to the South 
Swedish diphthongization described in § 2.3.2.1. Long vowels are pronounced as 
rising diphthongs that reach the Standard Swedish vowel quality at the end. Front 
vowels are closing, while back vowels start as central unrounded vowels and move 
backwards to their Standard Swedish vowel quality. According to Elert (2000, 39), 
the close vowels are the most strongly diphthongized vowels in South Swedish. The 
area with the magenta color in Figure 6.4 corresponds well to the area of South 
Swedish diphthongization described by Elert (2000, 39-40). 
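
The outlier reasoning applied to the box plots in Figure 6.5 can be sketched as follows. The sketch assumes the common convention that box-plot outliers lie more than 1.5 interquartile ranges above the upper quartile (Tukey's rule); whether the plots in this thesis use exactly that convention is not stated, and the sample values are made up.

```python
import statistics


def upper_outliers(values, k=1.5):
    """Values above Q3 + k * IQR (Tukey's rule, assumed here)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    fence = q3 + k * (q3 - q1)
    return [v for v in values if v > fence]


# Made-up diphthongization values: most near zero, one strongly positive.
sample = [-0.1, 0.0, 0.0, 0.05, 0.1, 0.1, 0.15, 0.2, 1.0]
print(upper_outliers(sample))
```

Values flagged in this way are candidates for the strongly diphthongized varieties; which sites they belong to still has to be checked against the factor scores.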

Figure 6.6 shows an example of the dynamic change in PC scores during the 
vowel segments in South Swedish diphthongization. The average scores of the older 
speakers at the site Norra Rörum at all nine sampling points are displayed for the 
four vowels. The PC traces show a very regular diphthongization across the vowels. 
Figure 6.5. Box plots that show the dispersion of the diphthongization values of 
vowels with high loadings on the second factor: typ (PC1), dis (PC1), lus (PC1), 
sot (PC2). In order to interpret the high scores of South Swedish varieties on the 
second factor, it was important to know how high the highest values on the variables 
involved were. The box plots show that the largest sample values are around 1 for 
all variables, and that there are a number of outliers with high values. 
Figure 6.6. As an example of South Swedish diphthongization, the plot shows 
the average scores at all nine sampling points of the older speakers at one site, 
Norra Rörum. The vowels with high loadings on the second factor are displayed: 
typ (PC1), dis (PC1), lus (PC1), sot (PC2). The traces show the regularity of the 
South Swedish diphthongization across the four vowels. 
In the younger generation there are fewer sites with very high scores on the 
second factor than in the older generation, but magenta-hued colors are found at 
a number of sites in Götaland and also in Närke. For these sites a relatively open 
pronunciation of the vowels involved was found in § 6.1. 


6.3.4 Factor 3 


The third extracted factor explains 8.4% of the variance in the data. Vowel features 
with high loadings on the third factor are displayed in Table 6.7. Vowels related 
to this factor are the front mid vowels in the words lett, dör, lös, söt, lär and 
leta. Figure 6.7, which displays the factor scores of the third factor, shows a large 
difference between older and younger speakers. The map of the older speakers is 
dominated by green, which means low factor scores, while the map of the younger 
speakers is mostly magenta with only a few green spots. 

The variation on the third factor concerns the onset value of PC1 of all of the in- 
volved vowels. The loadings are all positive, which means that a high score (magenta 
on the maps) suggests a high PC1 value, while low scores (green) indicate low PC1 
values. Interpreted in articulatory terms magenta represents a more open pronun- 
ciation of the vowels involved than green. Hence, the pronunciation of these vowels 
is generally more open among younger speakers than among older speakers. 

The fact that younger speakers in general pronounce these vowels more openly 
than older speakers was also noted separately for each vowel in § 6.1. However, the 
different vowels involved show slightly different patterns. 

The pronunciation of the vowel in lett seems to have become more open in the 
younger generation in general. The vowel of leta also has a more open pronunciation 
among younger speakers than among older speakers, but in some areas the vowel is 
a closing diphthong, which means that the PC1 values are high close to onset but 
decrease towards offset. The closing diphthong is found in South Swedish varieties, 
on Gotland and among older speakers in Norrbotten. 

The vowels in dör and lär, that is, the pre-/r/ variants of /ø:/ and /ɛ:/, have an 
open pronunciation in the older generation in an eastern area (particularly Uppland, 
Gotland and Finland) and in the South Swedish varieties. In dör the dialects 
in Norrbotten also have an open pronunciation close to onset, but the vowel is strongly 
diphthongized. In the younger generation the open pronunciation of the vowels in 
dör and lär has spread all over the language area. 


Table 6.7. High loadings on the third factor (8.4% variance explained). 

vowel            measure  pc    loading
lett     /e/     onset    pc1    0.821
dör      [œ:]    onset    pc1    0.769
lös      /ø:/    onset    pc1    0.756
söt      /ø:/    onset    pc1    0.739
lär      [æ:]    onset    pc1    0.720
leta     /e:/    onset    pc1    0.688
Figure 6.7. Scores of the third factor. Green = low score, magenta = high score. 
Vowels with high loadings are displayed in Table 6.7. The large difference between 
older and younger speakers has to do with the lowering of front mid-vowels by 
younger speakers. 


For the vowels in lös and söt the pattern looks somewhat different. The söt 
vowel has a quite close pronunciation among most of the older speakers. The open 
pronunciation in the younger generation has not spread as much as for the vowels 
in dör and lär, but the most open pronunciations are found in central Sweden and 
on the west coast. The lös vowel shows a distribution similar to that of the söt vowel for 
younger speakers, but among older speakers the lös vowel is diphthongized at many 
sites, especially in Norrland and on Gotland. 

In spite of the somewhat different geographic distributions for the different vowels 
involved in the third factor, the factor has detected a general trend and collected 
vowels with higher PC1 values for younger speakers than for older speakers. 


6.3.5 Factor 4 


On the fourth factor only one vowel has high loadings: the vowel in dörr. The 
variation involves the diphthongization of PC1 and, to a lesser extent, the vowel quality 
at onset (Table 6.8). This factor explains 5.2% of the variance in the data. The 
pattern is generationally quite stable. The geographic areas that are identified by this 
Table 6.8. High loadings on the fourth factor (5.2% variance explained). 

vowel            measure  pc    loading
dörr     [œ]     diphth   pc1   -0.773
dörr     [œ]     onset    pc1   -0.572


Figure 6.8. Scores of the fourth factor, which is connected to the pronunciation of 
the vowel in dörr (see Table 6.8). Green = low score, magenta = high score. 


factor are the South Swedish area and the province of Jämtland (Figure 6.8). The 
loadings are negative, which suggests that variants with high scores (magenta) have 
a lower diphthongization value. In § 6.1.14 negative diphthongization values on PC1 
(that is, an opening diphthong) were found in Jämtland and in the South Swedish 
area, while most varieties of Swedish do not have a diphthongized pronunciation. In 
the South Swedish area the vowel starts and ends more open than in Jämtland. 

Because only one vowel is involved in the fourth factor, the factor does not 
contribute much to the detection of co-occurring features. What the factor tells us 
is that varieties with a diphthongized pronunciation of the vowel in dörr also have 
low PC1 values close to onset. 
Table 6.9. High loadings on the fifth factor (5.2% variance explained). 

vowel            measure  pc    loading
sot      /u:/    onset    pc1    0.813
sot      /u:/    diphth   pc1    0.758
lus      /ʉ:/    diphth   pc2    0.571




Figure 6.9. Scores of the fifth factor. Green = low score, magenta = high score. 
The underlying factor is secondary diphthongization of the vowels in sot and lus on 
Gotland and in Norrbotten (Table 6.9). 


6.3.6 Factor 5 


The fifth factor identifies two very clear dialect groups: the dialects of Gotland and 
the dialects in Norrbotten (Figure 6.9). However, on Gotland both older and younger 
speakers are assigned high scores (magenta), while in Norrbotten the younger speak- 
ers have low scores like the rest of the language area. The variation involves the onset 
value and diphthongization of PC1 of the vowel in sot and to some extent the diph- 
thongization of the vowel in lus (Table 6.9). The fifth factor explains 5.2% of the 
variance in the data. 

Diphthongization of these two vowels on Gotland and among older speakers in 
Norrbotten was also detected in §§ 6.1.7 and 6.1.19. The type of diphthongization 
of the vowel in lus was found to vary across the dialects in Norrbotten. 
On Gotland, diphthongization of these vowels (so-called secondary diphthongization, 
see § 2.3.3) is part not only of the local dialects, but also of standard-like 
speech (see § 2.3.2). On the basis of only this factor no conclusions can, thus, be 
drawn about whether the Gotlandic speakers speak a traditional dialect or a more 
leveled standard-like variety. 

In Norrbotten diphthongs are characteristic of the traditional dialects. However, 
in standard-like speech in this area diphthongs are lacking (Johansson, 1982; 
Elert, 1994). According to Figure 6.9, most of the younger speakers in Norrbotten 
do not use diphthongs. This feature seems to be leveled and the pronunciation of 
the vowels is closer to Standard Swedish. 


6.3.7 Factor 6 


The amount of variance explained by the sixth factor is 4.9%. The vowels with 
high loadings are the vowels in the words lott, lat and lass (Table 6.10). These 
vowels have changed very little between the older and younger generation according 
to the results in the previous section (Table 6.2, p. 105). Also in Figure 6.10, which 
displays the scores of the sixth factor, the geographic variation pattern looks very 




Figure 6.10. The sixth factor identified small, geographically conditioned differ- 
ences in the pronunciation of the vowels in lott, lat and lass (see Table 6.10). Green 
= low score, magenta = high score. 
similar in the older and younger generation. All of these vowels also show a relatively 
small amount of geographic variation (Table 6.1, p. 103). Apparently, the sixth factor 
catches relatively subtle differences. 

The variation concerns the amount of diphthongization of PC2. For the vowel 
in lott the loading is excellent, while the vowels of lat and lass have lower loadings. 
The loadings are positive, which indicates that variants with higher scores (magenta) 
have higher diphthongization values. 

Mainly South Swedish varieties and a number of varieties in Norrland are assigned 
low scores (green) on this factor. Also the older speakers of some sites in Svealand 
and on Gotland have low scores, as well as some of the Finland-Swedish varieties. 
The highest scores are found in Götaland. Differences in the diphthongization of the 
vowel in lott between varieties in Götaland and Norrland were also noted in § 6.1.18. 


Table 6.10. High loadings on the sixth factor (4.9% variance explained). 

vowel            measure  pc    loading
lott     /ɔ/     diphth   pc2    0.732
lat      /a:/    diphth   pc2    0.635
lass     /a/     diphth   pc2    0.568


6.3.8 Factor 7 


The amount of diphthongization on PC1 of the vowels in the words lås/låt, nät and 
lär correlates positively with the seventh factor (Table 6.11), which explains 4.7% 
of the variance in the data. 

The amounts of diphthongization of PC1 of lås/låt and nät both have a mean 
value close to 0 and a standard deviation of 0.3. Low scores (green), hence, indicate 
negative diphthongization values, while high scores (magenta) suggest positive 
diphthongization values. The vowel in lär has less variation in the degree of 
diphthongization, with a mean of -0.1 and a standard deviation of 0.2. 

Figure 6.11 shows that the dialects on Gotland have the lowest scores in the 
older generation, together with some Central Swedish varieties. The low scores 
suggest opening diphthongs in lås/låt and nät. This is a part of the Gotlandic 
diphthongization described in § 2.3.2.3. 

Eklund & Traunmüller (1997) noted a substantial diphthongization of mid vowels 
in Standard Swedish, which could fit with the green Central Swedish area in the 
maps. 


Table 6.11. High loadings on the seventh factor (4.7% variance explained). 

vowel            measure  pc    loading
lås/låt  /o:/    diphth   pc1    0.797
nät      /ɛ:/    diphth   pc1    0.751
lär      [æ:]    diphth   pc1    0.571
Figure 6.11. Scores of the seventh factor, which is primarily determined by the 
amount of diphthongization of the vowels in lås/låt and nät (see Table 6.11). Green 
= low score, magenta = high score. 


High scores (magenta) suggest closing diphthongs in lås/låt and nät, while intermediate 
colors indicate a more monophthong-like pronunciation. Conclusions about 
the pronunciation in lär are harder to draw because of the lower correlation. 

In the younger generation the geographic distribution on the seventh factor looks 
very similar to the older generation, but there are fewer extreme scores overall. 


6.3.9 Factor 8 


The eighth factor explains 3.7% of the total variance. As Figure 6.12 shows, there 
is one extreme outlier on the eighth factor: the older speakers in Överkalix 
(Norrbotten). Two vowels have high loadings: the vowels in lass and lat (Table 6.12). 
The pronunciation of these vowels in Överkalix is discussed in § 6.1. 

Apart from the one outlier, low scores are found for example in the South Swedish 
area and in Uppland, while Västergötland has very high scores. 
Table 6.12. High loadings on the eighth factor (3.7% variance explained). 

vowel            measure  pc    loading
lass     /a/     onset    pc1    0.714
lat      /a:/    onset    pc1    0.652




Figure 6.12. Scores of the eighth factor, connected to the pronunciation of the 
vowels in lass and lat. Green = low score, magenta = high score. 


6.3.10 Factor 9 


On the ninth factor not a single variable attained a loading that could be considered 
good. The highest loading, 0.504, was that of the onset value of PC2 of the vowel in 
the word dör. Nonetheless, 3.1% of the variance in the data is 
explained by this factor, and some geographic areas can be identified in the maps in 
Figure 6.13. 

In the older generation the dialects of Jämtland have very high scores (magenta), 
while east Svealand is assigned the lowest scores together with sites on the Finnish 
south coast. In the younger generation, the varieties are not assigned as extreme 
scores as in the older generation. The east Svealand dialects are more similar to the 
surrounding area in the younger generation than in the older generation. In Finland, 
low scores are maintained in the younger generation. The dialects of Norrbotten are 
very different from each other in the older generation, but the younger speakers in 
this area are assigned very similar scores. 
Figure 6.13. Scores of the ninth factor. Green = low score, magenta = high score. 
No variables obtained loadings > 0.55. 


6.3.11 Factor 10 


On the tenth factor only one variable has a good correlation: the amount of 
diphthongization on PC2 of the vowel in lös (0.577). The amount of variance explained 
is 2.9%. In Figure 6.14, the most extreme low scores (green) on this factor are found 
on Gotland and the most extreme high scores (magenta) are found in Kalix and 
Nederluleå (Norrland). Dialects in both these areas are known to have preserved 
Proto-Nordic diphthongs, like the one in lös (see § 2.3.3). Still, the pronunciation 
in these two areas is different according to the factor scores. The diphthongization 
on PC2 in Kalix and Nederluleå is mentioned in § 6.1.12. For the Gotlandic 
varieties, no large difference between the vowels in söt and lös was found in § 6.1, but 
diphthongization of Standard Swedish /ø:/ was detected on Gotland. 

On Gotland the scores on the tenth factor are similar in the older and younger 
generation. In Norrbotten there has been a dramatic change between the two 
generations. 
Figure 6.14. Scores of the tenth factor. Green = low score, magenta = high score. 
Vowel with high loading: lös. 


6.4 Summary 


In this chapter the geographic variation and age related variation in the 19 different 
vowels in the data set were analyzed. A detailed view of the variation in each vowel 
was given in § 6.1, while the amount of geographic and age related variation across 
the 19 vowels was compared quantitatively in § 6.2. In a factor analysis (FA, § 6.3), 
the variance in the data set was reduced into ten underlying factors. The three 
different methods applied in this chapter supplement each other. The quantitative 
comparison in § 6.2 and the FA in § 6.3 detect vowels and variables which show 
similar kinds of variation and, hence, summarize the data. The analysis per vowel in 
§ 6.1 gives a detailed account of the variation beyond the comprehensive patterns
detected by the two other analyses. 

The comparison of the amount of variation in the 19 different vowels, in § 6.2,
showed that long vowels vary more than short vowels both across sites and across the
two age groups. The two geographically most varying vowels in both age groups were
the vowels in dör (Standard Swedish [œ:]) and sot /u:/, while the geographically
least varying vowels were the vowels in disk /ɪ/, lass /a/ and särk [æ]. Almost all
vowels vary geographically less in the younger speaker group than in the older. The
average linguistic distance between dialects is significantly shorter in the younger
speaker group than in the older speaker group. The shortening of the linguistic
distances indicates ongoing dialect leveling in the Swedish language area. The
vowels with the largest decrease in geographic variation were the vowels in lär [æ:]
and lat /a:/. Only the vowel in söt /ø:/ showed considerably more geographic
variation among younger speakers than in the older age group.

Some of the factors in the FA identified distinct dialect groups in the data, while 
others showed a more continuous distribution of the vowel features. Some dialect 
areas were identified by a number of the extracted factors. For example, the first 
and the ninth factor showed a relationship between dialects in Svealand and the 
dialects along the Finnish south coast. The dialects on Gotland were shown by 
several factors to share some vowel features that distinguish them from the other 
Swedish dialects (factors three, five, seven and ten). The second factor showed that 
the most distinguishing feature of South Swedish is the diphthongization of long close 
vowels. A number of factors also identified distinguishing features for the dialects in 
Norrland. Especially in the provinces Jämtland (factors three, four and nine) and
Norrbotten (factors three, five and ten) divergent dialects are found among the older 
speakers. 

Factors seven and nine showed similar geographical distribution patterns in the 
older and younger generation of speakers, but among the younger speakers the factor 
scores were less extreme. These factors might show either that fewer of the
younger speakers than of the older use the dialectal features (that is, some of
the younger speakers at a site use the standard variant and others the local variant,
which gives less extreme average values than for older speakers), or that the dialectal
markers are maintained by younger speakers but the acoustic distances between the
variants are becoming smaller.

Some factors failed to bundle together features of several vowels and correlated
highly with only one of the vowels in the data set. For example, factor four
correlated highly only with the vowel in the word dörr and showed that the vowel quality
close to the onset of this vowel is connected to the amount of diphthongization.

The eighth factor identified one outlier in the data set. It showed that the older
speakers in Överkalix have a divergent pronunciation of the vowels in lass and lat.

The sixth factor showed that the vowels in lott, lat and lass co-occur when it 
comes to the amount of diphthongization. However, the analyses in §§ 6.1 and 6.2 
showed that the total amount of variation in these vowels is very small. 

The Swedish front mid vowels showed most variation across the two age groups.
The vowels in the words lär [æ:], nät /ɛ:/, lös /ø:/, dör [œ:], lett /e/, söt /ø:/ and
leta /e:/ were found to have the largest average distances between older and younger
speakers in § 6.2. All of these vowels, except for the vowel in nät, were also found to
co-occur by the third factor of the FA. This suggests that the change in the vowel
in nät has a different distribution than that in the other vowels. The FA showed
that what these vowels have in common is that the PC1 values are higher among
younger speakers than among older speakers, which means that the pronunciation
is becoming more open. The results in § 6.1 showed that the vowels in lär, nät and
lett are becoming more open in almost the whole Swedish language area. For the
vowel in dör a more close pronunciation is preserved only in western parts, while the
opening of the vowel in lös and söt is restricted to a smaller area in central Sweden
and on the west coast.

The change towards a more open pronunciation of the vowels in lär and dör
means a change in the direction of the standard language and varieties spoken around
Stockholm. In the case of nät, however, the varieties around Stockholm (together
with other eastern varieties) have the most close pronunciation in the older
generation and the varieties around Stockholm are the ones that are changing the most.
The close pronunciation of the nät vowel in Stockholm (due to a merger of /e:/ and
/ɛ:/) has never been accepted as Standard Swedish (Elert, 2000, 47).

The spreading of a more open pronunciation of the vowels in söt and lös (Standard
Swedish /ø:/) is described by Nordberg (1975) as a change from below (see
§ 6.1.11). The open variant has previously been stigmatized and used mainly by
speakers in the lower social classes. In Nordberg’s study, however, open variants
were more common among younger speakers than among older speakers, and were
used also by higher social class youth. In an FA of a number of dialectal features
in the spoken language of Eskilstuna, Hammermo (1989) showed that there was a
co-variation between /ø:/ and /ɛ:/. The open pronunciation of these vowels was
interpreted by Hammermo as a marker of local identification with Eskilstuna. Other
features that co-varied to some extent with these vowels were the pronunciation
of /a:/ and the degree of Central Swedish diphthongization (§ 2.3.2.2). In
Hammermo’s data young girls used more open variants of /ø:/ and /ɛ:/ than young boys.
Aniansson (1996) analyzed the same data as Hammermo (1989) and made the
interpretation that the open variants were not seen as local markers by the girls, but had
become the more prestigious pronunciation. A similar trend was shown by Kotsinas
(1994), who described the diffusion of the previously stigmatized open pronunciation
of these vowels to Stockholm, where girls, not least in the upper-class, exceeded boys
in the use of open variants.

The impact of the lowering of /ø:/ and /ɛ:/ on a large part of the language
area is shown very clearly in the analyses in this chapter. Also for the speakers 
who represent Standard Swedish this ongoing change could be noted. The formerly 
stigmatized variants, hence, seem to be becoming the preferred variants. Milroy, 
Milroy, & Hartley (1994) described a case where female speakers starting to use a 
previously stigmatized feature (glottalization) in British English led to the feature 
gaining prestige. They suggested that women create prestige instead of being the 
ones adapting the most to the standard language, which has been postulated in 
many other studies. This hypothesis fits very well with the Swedish data. In the 
studies by Hammermo (1989), Kotsinas (1994) and Aniansson (1996), young girls 
were the ones favoring previously stigmatized open pronunciations of /ø:/ and /ɛ:/.
Comparison of older and younger speakers in the present study shows the diffusion 
of the open pronunciation to a large number of rural Swedish sites, and even to 
speakers considered representatives of Standard Swedish. 


Chapter 7 
Aggregate analysis 


The previous chapter showed that groups can be identified among Swedish dialects 
based on some specific vowel features. In contrast to the analysis of separate features,
the dialectometric research tradition has emphasized aggregate analysis, which
shows how dialects relate to each other when all available variables are considered
simultaneously (Nerbonne, 2009). Common aggregating methods are cluster analysis
and multidimensional scaling (MDS). Both methods use a distance matrix with the 
pairwise distances between all objects as input. MDS is a method for reducing 
complex distance data to interpretable low-dimensional representations, while cluster 
analysis produces partitions of the data. MDS is suitable for visualizing dialect 
continua, while cluster analysis detects dialect groups. 

Cluster analysis is a relatively unstable method where small differences in the
input data can result in substantially different outputs (Jain & Dubes, 1988;
Nerbonne, Kleiweg, Heeringa, & Manni, 2008). In an analysis of Bulgarian dialects
Prokić & Nerbonne (2008) found that only clusters that could be visually identified
in a two-dimensional MDS plot were identified with high confidence by a number of
clustering algorithms. Legendre & Legendre (1998, 482) argue that when only two
or three dimensions are considered in MDS the analysis might fail to identify
partitions that are distinguished in higher dimensions. However, when applying MDS
to dialect data three dimensions generally explain at least around 90% of the total
variance in the data (Heeringa, 2004; Prokić & Nerbonne, 2008). Hence, higher
dimensions are unlikely to play any role in identifying group structure.

Tibshirani, Walther, & Hastie (2001) proposed a method called the Gap statistic for
estimating the number of groups in a data set. The Gap statistic can be used for
estimating the number of significant clusters produced by any clustering algorithm. 
Lundberg (2005) used the Gap statistic to estimate the number of significant clusters 
when grouping the Swedish dialects based on acoustic analysis of the vowel in the 
word lat (/a:/ in Standard Swedish) and found three significant clusters.

The Gap statistic was applied to the present data set using the CLUTO¹ software.
The analysis showed that there are no well separated clusters in the data set. This 
result suggests that the Swedish dialects form a true continuum when it comes 
to an aggregate analysis of vowel pronunciation. The absence of clearly separable 
dialect groups is in agreement with previous research. The dialects in the Swedish 
language area are said to form a true continuum without abrupt borders (see § 2.2.1). 
Clustering methods could be applied to the data, but they are likely to produce 
unstable results, since any sharp division into subsets is not in agreement with the 
structure of the data. In this chapter, an aggregate analysis of vowel pronunciation 
in Swedish dialects is proposed by means of MDS. The MDS plots (for example, 
Figure 7.1) confirm the view that the Swedish language area is a genuine dialect 
continuum. 

In § 7.1 MDS is described, and § 7.2 gives the results of a number of MDS 
analyses. The results of the aggregate analyses are summarized in § 7.3. 


7.1 Multidimensional scaling 


Multidimensional scaling (MDS) reduces complex distance data to low-dimensional 
representations and allows visualization of the distances in a low-dimensional space. 
Like principal component analysis (PCA, § 5.1.2) and factor analysis (FA, § 6.3.1),
MDS is a dimensionality reduction technique. One difference between PCA and
FA, on the one hand, and MDS, on the other, is that PCA and FA analyze the full data
matrix where every object is described by a number of variables, while MDS analyzes 
the distances/similarities between objects based on some chosen distance/similarity 
measure. In MDS the aim is to represent the objects in a small number of dimensions, 
while the exact preservation of original distances is less important than in PCA or 
FA (Legendre & Legendre, 1998, 444). In MDS priority is given to preserving the 
ordering of the objects instead of the exact distances between objects. Because 
of this, MDS allows us to investigate the relationships between dialects in fewer 
dimensions than FA. MDS is normally used to scale to two or three dimensions, 
since more dimensions are difficult to visualize simultaneously. 

The results of MDS presented below show that MDS scaled to three dimensions 
explains more than 95% of the variance in the present data set. This can be compared 
to the FA in the previous chapter, where ten factors explained only 60.6% of the total 
variance in the data. However, FA explains 60.6% of the variance in the original
data, while MDS explains 95% of the variance in the distance matrix. Converting
the original data to pairwise aggregate distances in itself reduces the amount of
information in the data. The FA showed how Swedish dialects relate to each other
when some specific features are considered. By reducing the data to a smaller 
number of dimensions than FA, MDS gives an aggregate analysis, that is, it allows
us to see how the dialects relate to each other when all variables are taken into
account simultaneously.

¹CLUTO: software for clustering high-dimensional data sets. By the Department
of Computer Science and Engineering, University of Minnesota.
<http://glaros.dtc.umn.edu/gkhome/views/cluto>

In MDS, original distances between objects are approximated in a low-dimensional
space by an iterative algorithm (Jain & Dubes, 1988). A number of algorithms
for MDS have been proposed. In a study of the linguistic distances between varieties
of Dutch, Heeringa (2004) measured the fitness of three different MDS procedures
by correlating the original distances with the Euclidean MDS coordinate-based
distances, and found that Kruskal’s non-metric MDS gave the best results. In this
chapter Kruskal’s non-metric MDS, as implemented in the RuG/L04² software, is
used.
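
The idea of checking an MDS solution by correlating original and embedded distances can be illustrated with a small Python sketch. It is not the procedure used in this chapter: instead of Kruskal's non-metric MDS as implemented in RuG/L04, it uses classical (Torgerson) MDS, which has a closed-form solution, and it assumes that NumPy is available. The function names are illustrative, and treating the squared correlation between original and embedded distances as the proportion of variance explained is an assumption made for this example.

```python
import numpy as np


def classical_mds(dist, n_dims=3):
    """Classical (Torgerson) MDS: coordinates from a symmetric distance matrix.

    NOTE: this is a stand-in for illustration, not Kruskal's non-metric MDS.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centered matrix of squared distances (a Gram matrix).
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]      # indices of the largest ones
    vals = np.clip(eigvals[order], 0.0, None)       # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(vals)


def pairwise_euclidean(coords):
    """Matrix of Euclidean distances between the rows of coords."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def explained_variance(dist, coords):
    """Squared Pearson correlation between original and embedded distances."""
    d = np.asarray(dist, dtype=float)
    upper = np.triu_indices(d.shape[0], k=1)        # each pair counted once
    r = np.corrcoef(d[upper], pairwise_euclidean(coords)[upper])[0, 1]
    return r ** 2
```

For a distance matrix `dist`, calling `explained_variance(dist, classical_mds(dist, 3))` gives the value for a three-dimensional solution under these assumptions; comparing it with the one- and two-dimensional values shows how much each added dimension contributes.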

For the MDS the distances between the varieties were calculated as the average
distance of the 19 vowels in the data set. First, the distance for each vowel between
two varieties was calculated as the Euclidean distance of the acoustic variables of
vowel quality (Equation 6.1, p. 102), that is, two principal components (PCs),
measured at nine different points within each vowel segment, starting at 25% of the total
vowel duration and ending at 75%. Subsequently, the average of the vowel distances
was calculated. At some locations not all of the 19 vowels were elicited and those
sites had to be left out of the FA (§ 6.3). However, the average vowel distance
between two objects can also be calculated for a smaller number of vowels without
biasing the ordering of the data. Therefore, sites and speaker groups that were left
out of the FA could be included in the MDS.
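
The averaging step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the analysis: each variety is assumed to be stored as a dictionary mapping a vowel label to its list of acoustic values (2 PCs at 9 sampling points, that is, 18 numbers), the per-vowel distance is taken to be the plain Euclidean distance, and the function and variable names are hypothetical.

```python
import math


def vowel_distance(a, b):
    """Euclidean distance between two equal-length lists of acoustic values."""
    if len(a) != len(b):
        raise ValueError("vowel vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def aggregate_distance(site_a, site_b):
    """Mean vowel distance over the vowels recorded for both varieties.

    site_a, site_b: dicts mapping a vowel label (e.g. "lat") to its vector.
    Vowels missing for either variety are skipped rather than counted as
    zero, so varieties with an incomplete vowel set can still be compared.
    """
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("the two varieties have no vowels in common")
    total = sum(vowel_distance(site_a[v], site_b[v]) for v in shared)
    return total / len(shared)
```

Because the sum is divided by the number of shared vowels, a variety with fewer elicited vowels yields distances on the same scale as one with all 19.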

Results from MDS are often visualized in two- or three-dimensional coordinate 
systems. In this representation similar items are found close to each other, while 
dissimilar items are far apart. Nerbonne, Heeringa, & Kleiweg (1999) proposed a 
method for displaying the results of MDS of dialect data on maps. Each of the 
three basic colors in the RGB color model is used to represent one dimension of 
the MDS (see Appendix B, § B.1). Hence, each position in the three-dimensional 
space will be represented by a unique color (Figure B.1, p. 211). On the maps, the 
area of each variety is colored with the color corresponding to the position in the 
three-dimensional space. In this way, the positions in the MDS space are connected 
to the geographic locations, and the colors in the maps show how similar or distant 
dialects are to each other in a three-dimensional linguistic space. 
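
The mapping from MDS coordinates to colors can be sketched as follows. The exact scaling used for the maps is described in Appendix B and is not reproduced here; this Python example simply assumes that each dimension is rescaled independently to the range 0 to 255 (min-max scaling), and the function name is hypothetical.

```python
def mds_to_rgb(coords):
    """Map three-dimensional MDS coordinates to 8-bit RGB triples.

    coords: list of (d1, d2, d3) tuples, one per variety.
    Dimension one drives red, dimension two green, dimension three blue.
    Assumption: each dimension is min-max scaled on its own; a dimension
    without any spread is mapped to the middle intensity 128.
    """
    scaled = []
    for column in zip(*coords):
        lo, hi = min(column), max(column)
        if hi == lo:
            scaled.append([128] * len(column))
        else:
            scaled.append([round(255 * (v - lo) / (hi - lo)) for v in column])
    return [tuple(rgb) for rgb in zip(*scaled)]
```

Varieties that lie close together in the three-dimensional space receive similar colors, while varieties at opposite ends of a dimension receive the minimum and maximum intensity of the corresponding color channel.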


7.2 Dialect continuum


In order to explore the Swedish dialect continuum, MDS was applied to three differ- 
ent divisions of the data. In § 7.2.1 the geographic variation is described by averaging 
over all speakers per site. In § 7.2.2 the data is split into older and younger speakers 
per site. Since this division of the data is the same as the one that was used in the 
FA in § 6.3, a comparison with the results of the FA allowed for an interpretation
of the MDS dimensions, which is reported in § 7.2.3. In § 7.2.4 a further division
according to gender is made.

²RuG/L04: software for dialectometrics and cartography. By P. Kleiweg, University of
Groningen. <http://www.let.rug.nl/kleiweg/L04/>

Group means were calculated for PC1 and PC2 respectively for each of the 19 
vowels at the nine sampling points (see the lowest part of Figure 5.12 on p. 72). 
Subsequently the Euclidean distances between varieties were calculated based on 
the group means, and the resulting distance matrix was analyzed with MDS.


7.2.1 Geographic variation 


In the first MDS analysis, average values for each site (that is, the averages of 
the approximately twelve speakers) were calculated for the acoustic variables before 
measuring the linguistic distances between sites and applying MDS. The distance 
matrix comprised the pairwise distances between all 98 sites. 

Figure 7.1 shows the results of the MDS in a two-dimensional coordinate system 
where the gray-scale color of the dot represents the third dimension. One dimension 
explains 81.4% of the variance, two dimensions 93.6% and three dimensions 96.3%. 
It is clear that the first dimension already explains a very large part of the variance. 
Using more than three dimensions would only mean a small improvement of the 
variance explained. The plot does not show any clear clusters of sites, but there is
only one big cloud, which supports the result of the Gap statistic that the Swedish
dialects form a true continuum. The plot shows a concentration of sites close to the
origin (intersection of the axes) and a more sporadic distribution in the peripheries.

Figure 7.1. Results of MDS to three dimensions of linguistic distances between
sites. The first dimension is represented by the x-axis, the second dimension by the
y-axis and the third dimension by the gray-scale color of the dot. Because of the
density of sites close to the origin, a few labels have been omitted.

The geographic distributions of the three dimensions can be viewed more easily
in maps. Figure 7.2 displays the values of the sites on each of the three dimensions
on maps. A one-dimensional color spectrum is used for displaying the values
(Appendix B, § B.3). Magenta means high values and green means low values. In the
first dimension, sites in Svealand (mainly Uppland) and on the Finnish south coast
have high values, while mainly sites in Norrland have low values. Sites with high
values in the second dimension are the South Swedish ones, but also the ones in
Gotland and Överkalix (Norrbotten). Sites in the west of Norrland and Svealand
have low values in the second dimension. The third dimension separates sites in
Finland (magenta on the map in Figure 7.2, light shades in the plot in Figure 7.1)
from the ones in Uppland (green on the map, dark in the plot), and the Gotlandic
(magenta on the map, light in the plot) from the South Swedish ones (green on the
map, dark in the plot). Närpes (Österbotten) has an extremely high value (white in
the plot) in the third dimension.

The last map in Figure 7.2 shows the two first dimensions together, visualized 
with a two-dimensional color scheme (Appendix B, § B.2). As mentioned above, 
these two dimensions already explain 93.6% of the total variance in the data. The 
map displays the same thing as the positions of the sites in the plot in Figure 7.1 
without taking the color of the dot in the plot into account. The sites in Uppland and 
on the Finnish south coast, with high values in the first dimension and intermediate 
values in the second dimension have a light yellowish color. South Swedish sites 
and sites on Gotland have intermediate values on the first dimension and high values
in the second dimension, which leads to light blue colors on the map. The sites 
in Norrbotten also have quite high values in the second dimension, but very low 
values in the first dimension, which gives a darker blue color on the map. The third 
quadrant in the plot in Figure 7.1 (negative values on the two first dimensions) is 
dominated by sites in Norrland. These sites have dark colors on the map. The sites 
that are found close to the origin of the plot have grayish colors on the map. Many 
sites in Götaland have grayish colors. Skee (Bohuslän) is an outlier in the corner of
the fourth quadrant of the plot (positive values in the first dimension and negative 
on the second), which gives a clear yellow color in the map. In the fourth quadrant 
other sites close to the Norwegian border are found as well. 

Figure 7.3 shows the results of the MDS on a map using the full RGB color model, 
where each of the three dimensions is represented by a separate color; dimension one 
by the amount of red, dimension two by green, and dimension three by blue (see 
Appendix B, § B.1). In this map, the southernmost province, Skåne, forms a very
coherent area with low values in the first and third dimensions and high values in 
the second dimension leading to green color. The separation of the South Swedish 
varieties from the ones on Gotland in the third dimension can be seen in colors close 
to cyan on Gotland. Uppland is also a very coherent area with an orange color. 
Red and purple colors are found mostly close to the Norwegian border. In Norrland
mostly dark green colors are found, but also other dark colors and blue. Götaland
is quite incoherent with different colors from the center of the color spectrum. In
Finland there is a clear difference between the sites on the south coast and the
west coast. Saltvik (Åland) is much more similar to the sites in Uppland than to
Finland-Swedish varieties.

Figure 7.2. Results of MDS to three dimensions of linguistic distances between
sites. All three dimensions visualized separately, green = low values, magenta =
high values. The two first dimensions are also visualized together in the lower right
map.

Figure 7.3. Results of MDS to three dimensions of linguistic distances between
sites. Three dimensions visualized in one map with the RGB color model.

The map where all three dimensions are represented shows that even if the dis- 
tribution of dialectal features is continuous, some more coherent dialect areas can 
be detected. 


7.2.2 Analysis based on age 


In the following step age-related variation was analyzed in addition to the geographic 
variation. The distance matrix that MDS was applied to comprised the pairwise 
linguistic distances between 196 objects (2 age groups x 98 sites). One dimension 
explains 78.9% of the variance, two dimensions 92.3% and three dimensions 95.9%. 

Figure 7.4 shows the results of the MDS in a two-dimensional coordinate system
where the gray-scale color of the dot represents the third dimension. The objects
form one big cloud, except for one outlier, which for some reason has an extremely
high value in the second dimension. This outlier is the younger speakers of Löderup
(Skåne). The labels of the objects do not fit into the plot, but for each object a letter
indicates whether the dot concerns older or younger speakers. As can be seen, the
second dimension mainly seems to separate the two age groups. The older speakers
mostly have low values in the second dimension, while younger speakers have high
values.

Figure 7.4. Results of MDS to three dimensions of linguistic distances between
sites and age groups. The first dimension is represented by the x-axis, the second
dimension by the y-axis and the third dimension by the color of the dot. O = older
speakers, Y = younger speakers.

Figure 7.5. The first dimension of MDS of linguistic distances between sites and
age groups. Green = low values, magenta = high values. Panels: old, young.

To get a better picture of the distribution of the values in each dimension,
one-dimensional maps were created. Figure 7.5 shows the values in the first
dimension of the older and younger speakers of each site. The extremely high value
of the younger speakers of Löderup in the second dimension would mean that a
large proportion of the color representing the second dimension would be required
for representing this variety. In order to produce more separation between the other
varieties the young speakers of Löderup were left out of the color visualizations. The
maps show that in this analysis sites in Svealand and on the Finnish south coast are
assigned low values in the first dimension while sites in Norrland have high values
in the first dimension. This is roughly the reverse of the analysis per site in the
previous section. However, in MDS the directions of the axes are arbitrary and may
be rotated (Legendre & Legendre, 1998, 445), which means that the first dimension
in both analyses roughly represents the same thing. The maps of older and younger
speakers look quite similar.

Figure 7.6. The second dimension of MDS of linguistic distances between sites and
age groups. Green = low values, magenta = high values. Panels: old, young.
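
That the direction of an MDS axis carries no information can be checked directly: reversing the sign of one dimension leaves all pairwise distances unchanged. The following Python sketch demonstrates this with hypothetical coordinates and helper names; it assumes Python 3.8 or later for `math.dist`.

```python
import math


def pairwise(coords):
    """All pairwise Euclidean distances between points, in a fixed order."""
    return [math.dist(p, q) for i, p in enumerate(coords) for q in coords[i + 1:]]


def flip_axis(coords, axis):
    """Return a copy of coords with the sign of one dimension reversed."""
    return [tuple(-v if k == axis else v for k, v in enumerate(p)) for p in coords]


# Hypothetical three-dimensional MDS coordinates for three varieties.
points = [(1.0, 2.0, 0.5), (-0.5, 0.0, 1.5), (2.0, -1.0, -1.0)]
flipped = flip_axis(points, axis=0)

# The configuration is mirrored, but every distance is preserved.
same = all(abs(a - b) < 1e-12 for a, b in zip(pairwise(points), pairwise(flipped)))
```

Two solutions whose first dimensions are mirror images of each other therefore describe the same relations between varieties, and one of them can be reversed before coloring in order to make maps comparable.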

As already shown by the scatter plot in Figure 7.4, the second dimension mainly 
separates older and younger speakers. The maps of the second dimension in Fig- 
ure 7.6 confirm the picture of the scatter plot. However, for some of the peripheral 
dialects (South Swedish, Gotland, Finland, Norrland) there is not such a large dif- 
ference between older and younger speakers in the second dimension. 

Figure 7.7 shows that high values in the third dimension are assigned to South
Swedish sites, Gotlandic and all of the Finland-Swedish sites except for Åland. The
older speakers in Norrbotten have high values, while young speakers in Norrbotten
have low values in the third dimension. The lowest values in the third dimension
are found in Jämtland. The third dimension could, hence, be interpreted as a
peripherality dimension.

The maps in Figure 7.8 display all three dimensions of the MDS of older and
younger speakers per site simultaneously using the whole three-dimensional RGB
color spectrum. The two age groups are displayed on separate maps, but the colors
of the two maps are comparable since they are based on one single MDS analysis
(one distance matrix with the pairwise distances between older and younger speakers
of all sites was analyzed). Since the first dimension is more or less the inverse of the
first dimension in the previous section, where geographic variation was analyzed, the
red color representing the first dimension was reversed in order to obtain a similar
color representation to the one in Figure 7.3.

Figure 7.7. The third dimension of MDS of linguistic distances between sites and
age groups. Green = low values, magenta = high values.

The difference between the map of the older speakers and the map of the younger 
speakers is striking. In the map of the older speakers a broad spectrum of colors is 
found, while the map of the younger speakers is dominated by green. This shows a 
large-scale ongoing leveling of the Swedish dialects. The maps visually confirm the
result of the t-test in § 6.2.1, according to which the linguistic distances between 
sites are significantly shorter for younger speakers than for older speakers. 

By comparing the colors of older and younger speakers in Figure 7.8 conclusions
can be drawn about which dialects are undergoing the biggest change when it comes
to vowel pronunciation. For example, the sites in Finland have much more similar
colors for older and younger speakers than many of the sites in Sweden. In order to
get a clearer view of which dialects seem to be changing and which are
stable, maps visualizing only the within-site distances were created. In Figure 7.9
the distances between sites are disregarded and only the distances between older and
younger speakers at each site are displayed. The map to the left shows the distances
visualized with a continuous one-dimensional color spectrum (Appendix B, § B.3).
The sites with the shortest aggregate distance between older and younger speakers
are green, while sites with a large distance between older and younger speakers are
magenta.

Figure 7.8. MDS to three dimensions of linguistic distances between sites and
age groups. Both age groups were included in one single MDS analysis, and are
represented within the same color spectrum making the colors of the two maps
comparable with each other. Panels: old, young.

To get an even more distinct picture, the sites were divided into three groups
using K-means clustering.³ The map to the right in Figure 7.9 shows three groups
obtained by clustering sites with the most similar distances between the two age
groups. When three distinct groups are formed, all sites with only a relatively
small distance between older and younger speakers are clear green, sites with a large
distance are magenta, and all sites with intermediate distances between older and
younger speakers are pure gray.
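
Since each site is described here by a single number (the aggregate distance between its two age groups), the grouping amounts to K-means on one-dimensional data. The Python sketch below illustrates the principle only; the software and initialization used for the map are not specified in this chapter, the evenly spread starting centers are an assumption chosen to make the example deterministic, and the function name and the distance values are hypothetical.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Lloyd's algorithm on scalars; returns (labels, centers).

    Assumption: initial centers are spread evenly between the minimum and
    the maximum value, so that label 0 is the group with the smallest
    values and label k - 1 the group with the largest values.
    """
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty group keeps its previous center.
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical within-site distances between older and younger speakers.
distances = [0.1, 0.2, 0.15, 1.0, 1.1, 2.0, 2.2]
labels, centers = kmeans_1d(distances, k=3)
```

With these illustrative values, the labels 0, 1 and 2 would correspond to the green (small distance), gray (intermediate distance) and magenta (large distance) groups on the map.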

Dialects in the South Swedish area, on the islands Öland and Gotland, and in
Finland are green, and hence have small average distances in vowel pronunciation
between older and younger speakers. The same holds for many sites around lake
Vänern. These peripheral sites seem relatively stable when it comes to vowel
pronunciation. Many of the more central sites around and south-west of Stockholm
are gray or magenta, which suggests a big ongoing change in vowel pronunciation.
This is also the case for sites close to Göteborg on the west coast. In Norrland there
are sites of all three types: some dialects show a large ongoing change, some an
intermediate change, and some are relatively stable.

³K-means is the most commonly used clustering algorithm for partitioning data. The user
decides how many groups should be formed, and the algorithm partitions the most similar items
into groups by minimizing the total error sum of squares (Legendre & Legendre, 1998, 349-355).

Figure 7.9. Maps displaying the aggregate distances between older and younger
speakers at each site. In the map to the left a continuous color spectrum is used,
where green indicates the shortest distance and magenta the largest distance. In
the map to the right a more distinct picture is obtained by clustering the sites into
three groups. Legend: small distance, intermediate distance, large distance.

In the map of the younger speakers in Figure 7.8 it looks as if there is almost no 
variation in vowel pronunciation between younger speakers. This is not entirely true. 
The variation between younger speakers is simply so much smaller than the variation 
between older speakers, and between the two generations, that only a small part of 
the color spectrum is available for showing the differences between younger speakers 
at different sites. 

In order to be able to visualize dialectal differences within the younger age group, 
MDS was also applied separately to the older and the younger speakers. That is, 
two separate distance matrices were analyzed, one with the distances between older 
speakers at all sites and one with the distances between younger speakers at all sites. 
[Map panels: old, young]


Figure 7.10. MDS to three dimensions of linguistic distances between sites for each 
age group separately. The maps are based on two separate MDS analyses, so that 
the full color spectrum is used in each of the maps. The colors are not comparable 
across the two maps. 


The analysis of only older speakers included 98 sites and the amount of explained 
variance was 82.1% for the first dimension, 93.4% for two dimensions and 95.7% for 
three dimensions. The analysis of only younger speakers included 97 sites (younger 
speakers in Löderup were left out) and the amount of explained variance was 83.8% 
for the first dimension, 92.4% for two dimensions and 96.1% for three dimensions. 
Figure 7.10 shows the maps with results of MDS applied separately to the older 
and younger speakers. Since two separate analyses are displayed in the two maps, the 
colors of the maps have to be interpreted independently.4 The maps are similar to 
the ones in Figure 7.8, but the colors in each map are more distinct. When the whole 
color spectrum is used for each age group separately it becomes clear that there are 
differences across sites among the younger speakers, which could not be distinguished 
in Figure 7.8. Moreover, the geographic pattern is quite similar for older and younger 
speakers. So even if the dialectal differences in vowel pronunciation are larger in the 


4 For example, Gotland has a completely different color in the two maps. This does not mean that 
there is a large difference in vowel pronunciation between older and younger speakers on Gotland; 
it is only an effect of the full color spectrum being used to visualize the distances within each age 
group. 
older generation than in the younger, the geographic distribution of dialectal features 
remains more or less the same. 

Some differences in the geographic distributions can also be found. For example, 
the dialects in Norrland are more coherent in the younger age group than in the older. 
Figure 7.9 showed that some dialects in Norrland are relatively stable, while others 
have a large linguistic distance between older and younger speakers. In Norrland the 
most divergent dialects seem to be changing the most, and thereby a more uniform 
spoken variety of Norrland is emerging. This can be seen as regionalization of the 
dialects, since the most divergent dialectal features seem to be disappearing while 
some other features that distinguish Norrland varieties from other Swedish varieties 
are preserved. 


7.2.3 Interpreting MDS dimensions 


Because MDS is based on a distance matrix with pairwise aggregate distances 
between varieties, it does not offer any direct way to determine which of the 
original linguistic variables have caused the distribution in the extracted 
dimensions. In factor analysis (FA), however, the loadings indicate the correlation 
between original variables and extracted factors. In § 6.3 the average acoustic val- 
ues per age group per site were analyzed by means of FA. The same division of the 
data into older and younger speakers per site was used in the MDS in § 7.2.2, which 
makes a comparison between MDS and FA possible. 

The values of the objects in the three dimensions of the MDS were correlated 
with the scores of the FA using Pearson’s correlation. The correlations are displayed 
in Table 7.1. The first dimension corresponds well to the first factor (r = 0.921) and 
the second dimension correlates highly with the third factor (r = 0.840). The third 
dimension does not show very high correlations with one single factor, but seems 
to be a combination of several of the factors. Most strongly the third dimension 
correlates with the fifth (r = 0.468) and the ninth (r = -0.502) factor. 
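
The comparison described above amounts to computing Pearson's r between each MDS dimension and each factor over the same set of objects. A minimal sketch follows, assuming the MDS coordinates and FA scores are available as equally long lists in the same object order; the names pearson_r and correlation_table and all numbers are hypothetical, and the significance testing is not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_table(dimensions, factors):
    """Correlate every MDS dimension with every factor; keys are (dimension, factor)."""
    return {(d, f): pearson_r(dim_values, factor_scores)
            for d, dim_values in dimensions.items()
            for f, factor_scores in factors.items()}


# Hypothetical values for four objects (speaker groups at sites), in the same order.
dimensions = {"dim1": [0.0, 1.0, 2.0, 3.0]}
factors = {"factor1": [1.0, 3.0, 5.0, 7.0], "factor2": [3.0, 2.0, 1.0, 0.0]}
table = correlation_table(dimensions, factors)
```

The orientation of an MDS dimension is arbitrary, so a strong negative correlation indicates the same association as a strong positive one with the axis reversed.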

Since the FA includes loadings that tell us which of the linguistic variables are 
connected to each factor, a linguistic interpretation of the MDS dimensions can be 
inferred from the correlations with the factors. Based on the loadings of the FA, the 


Table 7.1. Pearson’s correlations dim1 dim2 dim3 
factor10 — — -0.218 




conclusion can be drawn that the first dimension of the MDS is largely based on 
differences in the PC2 values of front vowels (compare with the loadings of the first 
factor of the FA in Table 6.5, p. 111), which is assumed to be connected to voice 
quality differences. 

The second dimension of the MDS correlates highly with the third factor of the 
FA, and hence has to do with differences in the PC1 values of front mid vowels 
(Table 6.7, p. 116). The FA showed that these vowels are lowered by younger 
speakers in many areas. 

The third dimension is more unspecific and shows the effect of several variables, 
which have different geographic distribution patterns according to the FA. It seems 
that varieties with extreme scores in the third dimension of the MDS are not neces- 
sarily linguistically very similar to each other, but they are characterized by dialectal 
features that make them divergent from more central varieties. 

The effect of the second factor of the FA, which distinguishes South Swedish 
varieties from the rest, is spread over all three dimensions of the MDS (significant 
correlations between 0.2 and 0.3 with all three dimensions). 

The sixth factor of the FA, which detected only very subtle differences between 
dialects (see § 6.3.7), does not correlate significantly with any of the MDS dimensions. 


7.2.4 Analysis according to age and gender 


In the final MDS analysis the data from every site were divided into four groups: 
older men, older women, younger men, younger women. This resulted in a 390 × 
390 distance matrix (speaker groups with fewer than 15 vowels were not included, see 
Appendix A). One dimension explains 82.5% of the variance, two dimensions 93.0% 
and three dimensions 95.9%. 

Figures 7.11-7.13 display one-dimensional maps of the three first dimensions 
separately, with one map for each speaker group. Green color indicates low values 
and magenta high values (see Appendix B, § B.3). 

The solution is similar to the one in § 7.2.2, only with the second and third dimen- 
sions reversed. Low values in the first dimension (Figure 7.11) were assigned mainly 
to Svealand and the Finnish south coast, even though the geographic distribution is 
not completely coherent. 

The second dimension (Figure 7.12) mainly separates older and younger speakers. 
At most sites, the younger speakers have low values, while high values are found 
among older speakers (the inverse of the second dimension of the MDS in § 7.2.2). 
In the second dimension the older women have somewhat higher values on average 
than the older men. 

As in the analysis in § 7.2.2, the third dimension (Figure 7.13) could be called a 
peripherality dimension. South Swedish, Gotlandic and Finland-Swedish sites and 
the older speakers in Norrbotten have low values, while speakers in Jämtland have 
high values (the inverse of the third dimension of the MDS in § 7.2.2). 

Figure 7.14 displays the four maps that combine all three dimensions using the 
RGB color model. All three colors were reversed to obtain a color representation 
similar to the ones in the previous sections (Figure 7.8 and Figure 7.3). From these 
maps, it is obvious that the differences in vowel pronunciation are larger between 
the two generations than between men and women. The map of the older women 
is darker than the one of the older men, mainly due to the second dimension. As 
in the analysis in § 7.2.2, we can see that the dialectal differences are smaller in the 
younger generation than in the older, disregarding the variation across genders. The 
maps of both the younger men and the younger women are dominated by green, 
while the maps of the older speakers show a broader color spectrum. The maps of 
the older speakers show roughly the same geographic patterns that were detected in 
the two previous sections, while the color spectrum used for the younger speakers 
is so narrow that almost no geographic patterns can be detected. For the site Skee 
(Bohuslän) it can be noted that the younger men have a red color, like older speakers, 
while the young women have a color much more similar to the surrounding dialects. 
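
How three MDS dimensions can be turned into map colors may be sketched as follows. This is a minimal illustration under stated assumptions, not the mapping procedure actually used (see Appendix B for that): the assignment of dimensions to the red, green and blue channels, the linear scaling to 0-255, and the names scale_to_byte and mds_to_rgb are all hypothetical.

```python
def scale_to_byte(values, reverse=False):
    """Linearly scale values to integers in 0..255; optionally reverse the scale."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [128] * len(values)  # constant dimension: use a mid value
    scaled = []
    for v in values:
        t = (v - lo) / (hi - lo)
        if reverse:
            t = 1.0 - t
        scaled.append(round(255 * t))
    return scaled


def mds_to_rgb(coords, reverse=(False, False, False)):
    """Map three-dimensional MDS coordinates to (red, green, blue) triples.

    Assumption: dimension 1 -> red, dimension 2 -> green, dimension 3 -> blue.
    All objects are scaled jointly, so their colors are comparable.
    """
    channels = [scale_to_byte([c[d] for c in coords], reverse[d]) for d in range(3)]
    return list(zip(*channels))


# Two hypothetical objects at opposite ends of all three dimensions.
coords = [(0.0, 1.0, -1.0), (1.0, 0.0, 1.0)]
colors = mds_to_rgb(coords)
reversed_colors = mds_to_rgb(coords, reverse=(True, True, True))
```

Scaling all speaker groups in one call keeps the colors comparable across maps, as in Figure 7.14. Scaling each group separately would use the full spectrum within each map, at the cost of comparability, as in Figure 7.10.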


7.3 Conclusions of the aggregate analysis 


In this chapter the relationships between varieties of Swedish were analyzed using 
aggregate linguistic distances based on acoustic measurements of 19 vowels. The Gap 
statistic showed that the data cannot be partitioned into groups; instead, the Swedish 
dialects form a continuum without abrupt borders. This is in line with previous 
descriptions of the Swedish dialects. For visualizing the continuum-like relationships 
between varieties of Swedish, multidimensional scaling (MDS) was used. Five differ- 
ent analyses were carried out using MDS. In § 7.2.1 the linguistic distances between 
sites were analyzed. In § 7.2.2 the data were divided into older and younger speakers 
per site and three analyses were made: one including both older and younger speak- 
ers, one with only older speakers, and one with only younger speakers. In § 7.2.4 a 
further division according to gender was made and an analysis which included four 
speaker groups per site was carried out. 

The analysis of linguistic distances between sites showed that even if the distri- 
bution of dialectal features is continuous, some more coherent dialect areas can be 
detected. 

The analysis of the two age groups in the data set showed that the dialectal 
differences are considerably smaller in the younger generation than among older 
speakers. The effect of dialect leveling in apparent time is large. However, the 
geographic distribution of dialectal features is not changing much, so that the main 
dialect areas remain the same. This can be interpreted as regionalization of the 
dialects, since it seems that dialectal features that are characterizing the larger 
dialect regions are still being preserved at the same time as the overall linguistic 
distances become smaller. 

An analysis of the distances between the two age groups within each site showed 
that the central dialects, close to the biggest cities, seem to be changing the most, 
while many of the more peripheral dialects are relatively stable when it comes to 
an aggregate analysis of vowel pronunciation. This can seem surprising, because 
in a dialect leveling situation you might expect that the most divergent peripheral 
dialects would be converging to more central and standard-like varieties. In order to 
understand why the central dialects are changing the most this result of the aggregate 
analysis has to be studied in relation to the change in each of the variables. This is 
discussed further in § 8.2.2. 

A further division according to gender showed that the difference between older 
and younger speakers is much larger than the difference between men and women. 
In the older generation there seem to be more gender-related differences than in 
the younger generation. 

The results obtained by MDS could be correlated with results from the factor 
analysis (FA) in the previous chapter. This showed that the first dimension, which 
separates Svealand and south Finland-Swedish varieties in all the MDS analyses, is 
largely explained by the PC2 values of front vowels. The second dimension, the one 
producing the largest distance between older and younger speakers, is mainly an 
effect of the lowering of front mid vowels by younger speakers. The third dimension 
of the MDS is a peripherality dimension separating peripheral areas in the Swedish 
language area, like Skåne, Gotland, Finland, Norrbotten and Jämtland. Variables 
characterizing peripheral dialects were spread on several factors with different geo- 
graphical distributions in the FA. 
[Map panels: old men, old women, young men, young women]


Figure 7.11. The first dimension of MDS of linguistic distances between sites and 
speaker groups based on age and gender. Green = low values, magenta = high 
values. 
[Map panels: old men, old women, young men, young women]


Figure 7.12. The second dimension of MDS of linguistic distances between sites 
and speaker groups based on age and gender. Green = low values, magenta = high 
values. 
[Map panels: old men, old women, young men, young women]


Figure 7.13. The third dimension of MDS of linguistic distances between sites 
and speaker groups based on age and gender. Green = low values, magenta = high 
values. 
[Map panels: old men, old women, young men, young women]


Figure 7.14. Results of MDS to three dimensions of linguistic distances between 
sites and speaker groups based on age and gender, visualized with the RGB color 
model. All speaker groups are represented within the same color spectrum making 
the colors of the maps comparable with each other. 


Chapter 8 
Discussion 


Below, the results from the three previous chapters are discussed and related to 
each other. In § 8.1 the acoustic method is evaluated. In § 8.2 the dialectological 
conclusions about Swedish dialects that can be drawn taking both the analysis on 
the variable level and the aggregate level into account are discussed. Finally, in 
§ 8.3, the strengths and limitations of analysis of variables and aggregate analysis 
are discussed. 


8.1 Acoustic analysis 


The most common method used in variationist linguistics for assessing vowel quality 
acoustically has been formant measurements. In this study another approach was 
chosen. Vowel spectra were filtered with Bark filters up to 18 Bark and subsequently 
the filter bank representation was reduced to articulatorily meaningful principal 
components (PCs) by means of principal component analysis (PCA). The method has 
previously been used for large-scale analysis of geographic and social variation in 
Dutch vowel pronunciation by Jacobi (2009). 

Bark filters correspond to the auditory filters of human hearing, which means 
that a Bark filter representation of vowels models human perception. Formants, on 
the other hand, are resonant frequencies in the vocal tract and measuring formants 
hence is an articulatory model. A strong association between the two models was 
shown by high correlations between PCs and formants in § 5.2.1. The correlation 
between the second component (PC2) and F2 was, however, somewhat weaker than 
the one between the first component (PC1) and F1. 

Bark filtering can be automated more reliably than formant measurements. Auto- 
mated formant measurements always include wrong values that have to be corrected 
manually. In addition to the perception-based merits of a filter bank representation, 
the load of manual work was reduced significantly by the choice of method for acous- 
tic analysis. Nonetheless, one should bear in mind that a considerable amount of 
manual work was needed, too, to make this study possible. Preceding the acoustic 
analysis, all the vowel data had been manually segmented in the SweDia project. 

How to reduce the effect of speaker-dependent variation is a problem for all stud- 
ies dealing with acoustic speech samples (see §§ 2.4.3 and 2.4.4). Due to anatom- 
ical/physiological differences, the overall size of the vowel space varies significantly 
across speakers. A number of normalization procedures have been developed for 
formant measurements, but none of them work well when one wants to compare 
vowels from varieties that are not phonologically comparable. Jacobi (2009) solved 
the problem by relating the measurements of each vowel to the speaker's point 
vowels /i/ and /a/. Jacobi studied the variation in Dutch diphthongs and long vowels. 
Relative measures of vowel quality could be used because the Dutch point vowels 
are considered to be stable across all varieties of Standard Dutch. 

Among the Swedish vowels, stable point vowels could not be found for all speak- 
ers, which excluded the possibility to use a relative measure of vowel quality. Instead, 
speaker-dependent variation was evened out by averaging over a number of speakers 
per variety. However, the number of male and female speakers was not equal for 
each variety, which meant that the systematic differences in the vowel spaces of male 
and female speakers due to anatomical/physiological differences had to be removed 
in order not to bias the results. A normalization of the differences between male and 
female voices was obtained by applying PCA separately to vowels produced by men 
and women. This procedure effectively removed differences in PC scores between 
men and women (§ 5.1.5). Of the point vowels only [u:] showed a significant 
difference on one of the two extracted PCs after applying PCA separately to men and 
women. Before normalization, that is, when including vowels produced by men and 
women in one single PCA, all four point vowels ([i:], [æ:], [ɑ:]/[a:] and [u:]) showed 
significant differences between men and women in one or two of the extracted PCs. 
This possibility to normalize for the effect of speaker-sex in the PCs, something 
which has been notoriously difficult in formant measurements, turned out to be a 
big advantage of the acoustic method chosen. 

Using a so-called whole-spectrum method undoubtedly includes more informa- 
tion from the acoustic signal than only cues directly related to the articulation of 
vowels. The signal-to-noise ratio in the recordings has been shown to influence PCs 
extracted from band-pass filtered spectra both in the present study (dialect speakers 
recorded in their homes vs. speakers of Standard Swedish in a studio, see § 5.1.6) 
and by Jacobi (2009, 59-63). When using this method one should therefore either 
pay attention to all recordings being made in as similar conditions as possible, or 
alternatively find some stable point vowels which can be used for normalizing for 
the effect of noise. 

Regional differences were detected in the PCs that did not seem to be connected 
to formants and vowel articulation. This spectral feature was connected to PC2 
of all front vowels and an assumption is that voice quality differences would have 
caused this regional variation in the PCs. The exact nature of the spectral feature 
connected to PC2 of front vowels could, however, not be verified in this study, but 
should be studied further. A factor analysis applied to the data seemed to be able 
to separate this spectral feature from other variation in the data (§ 6.3.2). 
In the analyses of dialectal variation, average values of the PCs for each variety 
were used. For example, in the analysis of pure geographic variation, average values 
were computed for each site (that is, average PCs of twelve speakers for most of the 
sites). In a number of analyses social variation was accounted for as well. When the 
data was split up into two age groups the groups on average included six speakers, 
while a further division according to gender led to even smaller groups (on average 
three speakers per gender per age group per site). Since all speaker-dependent 
variation could not be removed from the acoustic PCs, the influence of individual 
differences in the overall size of the vowel space on the group averages is greater 
the smaller the group is. Especially in the division into four groups per site (older 
women, older men, younger women, younger men), where each group included on 
average only three speakers, some caution should be taken when interpreting the 
results. Still, also this division into the smallest groups showed very similar results 
to the other analyses. Most notably the differences between the two age groups 
were much greater than the differences between men and women, with especially 
the younger women and younger men showing very similar results. This can be 
seen as an additional confirmation of a successful reduction of differences related 
to the anatomical/physiological differences between men and women in the acoustic 
measure. 

The PCs extracted from the Bark-filtered vowel spectra were used in a number 
of analyses of dialectal variation (Chapters 6 and 7). Several features which have 
previously been described in the Swedish dialect literature were identified. The 
results from previous studies could hence be supported by acoustic data and the 
geographic distribution of dialectal features across a large number of sites in the 
Swedish language area could be established. The PCs offered an interpretation 
of the data in terms of vowel height and advancement. A complete articulatory 
description of vowels can, however, not be inferred from the PCs, since for example 
vowel roundness is not represented in a simple way in the PCs. 

Since using PCA of band-pass filtered vowel spectra can be automated more 
reliably than formant measurements, it can be regarded as well-suited for large-scale 
analyses of phonetic variation in vowel quality. Moreover, the method turned out 
to offer a possibility to normalize for the systematic difference between male and 
female voices, something which has always been regarded as difficult in formant 
measurements. The perceptual and articulatory correlates of PCs of Bark-filtered 
vowel spectra should still be studied further. Especially the variance in PC2, which 
was somewhat less dependent on formants than the variance in PC1, should be 
studied in more detail in future research. 


8.2 Dialectological results 


This section includes a discussion of the dialectological results of Chapters 6 and 7. 
The dialect areas identified in the aggregate analysis are described linguistically in 
§ 8.2.1 below, and in § 8.2.2 some explanations for the observed language change are 
discussed. Before going on with the dialectological results some general issues about 
the representativity of the data should be addressed. 

Language change was studied by comparing an older and a younger group of 
speakers at every site. A question to be asked is whether the observed differences 
between speaker groups in apparent time correspond to a real linguistic change. In 
the SweDia project the age range of the younger speakers was chosen so that the 
language recorded would not be the youth language of teenagers, which is likely to 
change when the speakers get older. A somewhat older group comprising speakers 
in their 20s or early 30s was recorded. Nordberg (2005, 1765) has pointed out that 
several studies from the Swedish area have shown that “young adults and people 
in early middle age stand out as comparatively more standard-speaking than other 
age groups, probably because this is the period of life when careers are built up and 
the values of the larger society become important.” On the other hand, Sundgren 
(2002) studied language change in Eskilstuna with access to language data for both 
an apparent time and a real time study. She could conclude that both the study in 
apparent time and the study in real time showed the same development and that 
the two methods would not lead to different conclusions about ongoing changes. 

Another concern would be to what extent the recording situation would influence 
the dialect speakers and whether older and younger speakers would make different 
accommodations to the language of the interviewers. The risk of the speakers being 
influenced by the speech variety of the interviewers is especially high in sites where 
the speakers are used to switching between, or using a sliding scale between, the local 
dialect and Standard Swedish. At some locations the interviews were carried out by 
a speaker of the local dialect, while at other locations the interviewers spoke a variety 
representative of a larger region. In all cases, the speakers were encouraged to think 
about how the words would be pronounced in the local vernacular. The impression is 
that this strategy was successful in most cases. At some sites, however, the resistance 
against speaking the local dialect to a stranger was exceptionally strong. This was 
perceived to be the case at least for many of the speakers in Snappertuna in Finland. 
The speakers from this particular site should therefore be regarded as speakers of the 
regional standard language and not of the local dialect. It cannot be ruled out that 
the dialect speakers have made linguistic accommodations towards the language of 
the interviewers at some other sites, too. 

One should keep in mind that the varieties analyzed are a sample of 98 sites from 
the Swedish language area. All possible linguistic variation can therefore not be 
accounted for. A few sites which are well known for their well-preserved divergent 
rural dialects (Orsa and Älvdalen) have been excluded from the current analysis. 
This is because they were already considered so different from other dialects during 
the SweDia fieldwork that a completely different word list was used for eliciting 
vowel sounds in these dialects. 
8.2.1 Dialect areas 


The results of multidimensional scaling (MDS) to three dimensions in § 7.2.1, where 
average linguistic distances between sites were analyzed, showed that even if the 
distribution of dialectal features is continuous, some more coherent dialect areas 
can be detected. The map in Figure 7.3 (p. 135) displaying the MDS results is 
very similar to the traditional divisions proposed by Wessén (1969) and Elert (1994) 
(Figure 2.2, p. 14). 

Acknowledging the fact that the borders between the Swedish dialect areas are 
not abrupt, but form a continuum, Wessén (1969) did not propose any borders 
between dialect areas, but only gave a rough sketch of a dialect division. This is 
similar to what the map displaying the results of MDS shows. There are definitely 
areas within the Swedish language area which differ from each other considerably 
when it comes to vowel pronunciation, but between these areas there are no abrupt 
borders, only gradual transitions. 

Wessén (1969) classified the Swedish traditional rural dialects, while the clas- 
sification of Elert (1994) was one of regional varieties of Standard Swedish. The 
classification by Wessén (1969) was based on phonetic, phonological and morpho- 
logical features while Elert (1994) used mainly intonation and differences in vowel 
pronunciation for grouping varieties of the Swedish spoken language. Both scholars 
considered vowel pronunciation an important characteristic for dialects and regional 
varieties of Swedish. The data of the present study were collected only at rural sites, 
but they are about a hundred years younger than the data that Wessén (1969) 
worked with. Because of the large-scale dialect leveling that has affected Sweden 
during the last half of the 20th century, traditional rural dialects have been preserved 
only in very few areas. The data of this thesis include more dialectal features than 
the varieties in Elert’s classification, but more leveled dialects than the ones that 
Wessén wrote about. The varieties studied in this thesis could be called modern 
rural varieties of Swedish. The results in § 7.2.2 showed that even if there is consid- 
erable dialect leveling going on in the Swedish language area, the geographic areas 
that can be identified are still very similar for older and younger speakers. Based on 
the analysis of the separate vowels in Chapter 6 the areas detected by the aggregate 
analysis in Chapter 7 can be described linguistically: 

The most prominent feature of South Swedish varieties is the diphthongization 
of long vowels. The close long vowels in dis, typ, lus and sot have the strongest 
diphthongization, which was identified by the second factor of the factor analysis 
(FA) in § 6.3. But also the mid vowels in leta, nät, söt, dör and låt are diphthongized, 
especially in Skåne. The South Swedish varieties have relatively high PC2 values 
(that is, a more fronted pronunciation than Standard Swedish) in the long and short 
a vowels in lat and lass. 

In Götaland a closer pronunciation than in Standard Swedish was noted for 
the vowels in dör and lär. This is also the case for younger speakers to a greater 
extent than in most parts of the language area. The open pronunciation of söt which 
is spreading among younger speakers in central Sweden is only found in the coastal 




areas of Götaland but not further inland. In many sites in Götaland a more open 
pronunciation than in Standard Swedish is found for the vowel in lus. Many younger 
speakers also have a relatively open pronunciation in dis and typ. On the west coast 
and in Dalsland the pronunciation of the disk vowel is more open than elsewhere. 
In Småland and Östergötland a diphthongization of the vowel in sot can be found. 

Svealand is characterized by a spectral feature which distinguishes it from the 
varieties in Götaland and Norrland. This spectral feature, identified by the first 
factor of the FA (§ 6.3.2), might be related to voice quality, and should be studied 
further. Uppland is characterized by a very close pronunciation in nät and an 
open pronunciation of the vowels in lär and dör. However, younger speakers in 
Uppland have a more open pronunciation in nät compared to the older speakers. 
An open pronunciation in söt, which is not common among older speakers, is used 
by younger speakers in the whole East Central Swedish area. In Närke a relatively 
open pronunciation of the vowels in dis and typ is found for younger speakers. 

Features that are found mainly in sites close to the Norwegian border are an open 
pronunciation of the vowel in flytta and for the more northern varieties a fronted 
pronunciation in lott. A relatively close pronunciation in lär and dör is found for 
both older and younger speakers in this area. 

In Norrland large linguistic distances between dialects are still found. However, 
in the most divergent areas, Norrbotten and Jämtland, the younger speakers show a 
considerable convergence to Standard Swedish. Among older speakers in Norrbotten, 
both Proto-Nordic diphthongs and secondary diphthongs are still found, but most 
younger speakers are not using the diphthongs. For many sites in Norrland a more 
open pronunciation than in Standard Swedish is found in lus and a relatively fronted 
pronunciation of the a vowels in lat and lass. 

The varieties in the southern parts of Finland share a spectral feature (which 
might be connected to voice quality) with dialects in Svealand. In most varieties 
in Finland the vowel in lus is a central vowel and not a front vowel as in Standard 
Swedish. Most sites, but not all, have a relatively close vowel in nät and an open 
vowel in lär. Proto-Nordic diphthongs are found in Houtskär (Åboland) and in 
Österbotten and secondary diphthongs primarily in Österbotten. A more fronted 
pronunciation than in Standard Swedish in lat is found especially in the south of 
Finland. 

Gotland is mainly characterized by its large number of diphthongs: both Proto-Nordic 
and secondary diphthongs are found. The pronunciation of the vowels in lär 
and dör is open, and the vowel in lat is more fronted on Gotland than in Standard 
Swedish. 

Two features which are considered characteristic for the vowel pronunciation of 
regional varieties of Swedish were not identified in the analyses in this thesis. These 
are Central Swedish diphthongization (§ 2.3.2.2) and the occurrence of a semi-vowel 
or fricative ending in long close vowels in central Sweden (§ 2.3.2.4). The reason 
that these features were not identified in the present data set could have to do with 
the choice of sampling points in the vowel segments. The first sampling point was at 
25% of the vowel duration and the last at 75%, so it is possible that diphthongization 
in the very first and last part of the vowel segments is missed. However, Central 
Swedish diphthongization is considered prosodically conditioned and is strongest in 
stressed vowels and at the end of sentences, while the semi-vowel or fricative ending 
in vowels is most noticeable in word-final position and before another vowel. Due 
to these facts the two types of diphthongization might not be prominently present 
in the data set. The speakers were pronouncing words in isolation, which would 
exclude many prosodic features, and all vowels were in a C__C context, which is not 
favorable for the semi-vowel or fricative ending. 

Comparison with results from other studies that have also used data from the SweDia 
database makes it possible to draw conclusions about associations 
between different linguistic levels. The intonational typology by Bruce (2004) (see 
the description in § 2.2.3) largely corresponds to the dialect areas described by 
Wessén (1969) and Elert (1994), and hence also to the areas detected in the present 
thesis. The intonational variation and the variation in vowel pronunciation seem to 
have very similar geographic distributions. 

Schaeffler (2005) made a typology of phonological quantity in Swedish dialects, 
also using SweDia data. The three main types identified by Schaeffler form rather 
different geographic areas than the ones identified based on vowel pronunciation. In 
Schaeffler’s study, Sweden was divided into a southern and a northern area with the 
border between the two areas approximately following the border between Svealand 
and Norrland. A transitional dialect border between Svealand and Norrland can be 
supported by the aggregate analysis in the present study. However, the differences 
between and within Svealand and Götaland are too large for grouping the dialects 
in these areas into one class based on vowel pronunciation. The third type identified 
by Schaeffler comprised the mainland Finland-Swedish varieties, which clearly form 
a separate group also in the present study. The dialects on Åland belong to the 
northern type in Schaeffler’s study. The present thesis also shows that the varieties 
on Åland share more vowel features with varieties in Sweden than with the 
Finland-Swedish dialects, but in contrast to Schaeffler’s results, Åland is more similar to 
Uppland than to the dialects in Norrland when it comes to vowel pronunciation. 

Hopefully more analyses of additional linguistic levels will be carried out in the 
future using data from the SweDia database. Quantitative comparisons of data from 
different linguistic levels could show interesting interactions and form the basis for 
linguistic typologies. 


8.2.2 Change and leveling 


The comparisons of the vowel pronunciation of older and younger speakers in this 
thesis (§ 6.2.2 and § 7.2.2) suggest large-scale dialect leveling; the linguistic distances 
between sites are shorter for younger speakers than for older speakers. The aggregate 
analysis revealed that the sites that show the largest amount of change are many of 
the most central ones close to the biggest cities, while many of the peripheral dialect 
areas, which are most divergent from Standard Swedish, are relatively stable when 
it comes to vowel pronunciation. 

The aggregate analysis as such does not provide an explanation for why the 
central dialects are changing the most. In order to find an answer, the variation and 
change on the variable level have to be studied. The analysis in § 6.2.2 showed that 
the vowels responsible for the largest amount of change are the vowels in the words 
lär, nät, lös, dör, lett and söt. The analysis of each of the vowels in § 6.1 showed 
that the patterns of change look quite different for some of these vowels. 

Even though one neutral variety of Standard Swedish hardly exists, there are still 
variants that are more or less associated with a Standard Swedish pronunciation. 
An example is the open pronunciation of the phonemes /ɛ:/ and /ø:/ before /r/, 
which is associated with Central Standard Swedish pronunciation (Bruce, 2010, 118; 
Grönberg, 2004, 139). The open variants of /ɛ:/ and /ø:/, that is [æ:] and [œ:], were 
elicited with the words lär and dör in the present study. The analyses showed that 
older speakers in large parts of Sweden have a much closer pronunciation than 
the Standard Swedish one. An open pronunciation corresponding to the Standard 
Swedish one is found among older speakers mainly in eastern parts of the language 
area, for example in the surroundings of Stockholm where the open pronunciation 
of these vowels has been a part of the rural dialects. Among younger speakers the 
open pronunciation has become much more widespread, so for these two vowels there 
seems to be a clear case of convergence to Standard Swedish. 

For the vowel in nät the pattern of change is different. The traditional 
pronunciation in Stockholm and the surrounding dialects has been [e:], while the 
pronunciation generally accepted as Standard Swedish and the pronunciation in most 
dialectal varieties of Swedish is [ɛ:]. The Stockholm pronunciation is the result of a 
merger of the phonemes /e:/ and /ɛ:/, which has never been accepted as Standard 
Swedish (Elert, 2000, 46). However, the merger is not a complete one, since before 
/r/ the two vowels are kept apart also in Stockholm and surrounding dialects.¹ 
For the nät vowel the present data set shows that older speakers around Stockholm 
have a much closer pronunciation than most other varieties of Swedish (except 
for varieties on Gotland and in Finland). In the younger generation the sites close 
to Stockholm have a more open pronunciation than in the older generation. For 
the nät vowel, there is a change towards Standard Swedish, but the sites that are 
changing the most and giving way to the Standard Swedish pronunciation are the 
ones closest to Stockholm. At the same time the pronunciation of the nät vowel 
also seems to be becoming even more open in many parts of the language area, 
approaching [a:]. 

For the vowels in söt and lös yet another geographic pattern of change is found. 
In both these words the Standard Swedish vowel phoneme is /ø:/.² The data shows 
that while the older speakers at most sites have a relatively close pronunciation 
of the söt vowel, younger speakers in a large east central area have a more open 
pronunciation. A change towards a more open vowel is also found on the west 
coast and even among the speakers who represent Standard Swedish in the SweDia 
database. 


¹ There is a distinction between /le:ra/ [le:ra] ‘clay’ and /lɛ:ra/ [læ:ra] ‘learn’, but not between 
/le:ka/ [le:ka] ‘play’ and /lɛ:ka/ [le:ka] ‘heal’. 

² Some dialects have preserved a Proto-Nordic diphthong in lös, and in these dialects the words 
söt and lös have different vowel phonemes. 

The vowel in lett, for which a large degree of change was noted in § 6.2.2, too, 
seems to show a similar change in the whole language area. At almost all sites in 
the data set the pronunciation is more open among younger speakers than among 
older speakers. 

In summary, the vowels in lär, dör and nät all show convergence to Standard 
Swedish. The dialect area surrounding Stockholm has traditionally had a standard-like 
pronunciation of lär and dör and therefore does not show much change in these 
two vowels. In nät the dialects close to Stockholm have a closer pronunciation 
than the standard pronunciation among older speakers, but young speakers seem to 
converge towards the Standard Swedish pronunciation. At the same time a change is 
found in the vowels in nät, söt and lett which cannot be seen as convergence to the 
standard language but is a relatively new feature in Swedish. These vowels have a 
more open pronunciation among younger speakers than among older speakers. The 
vowels in nät and lett seem to become more open in the Swedish dialects in general. 
For the vowel in söt (for most varieties of Swedish equal to the vowel in lös) the 
change towards a more open pronunciation is strongest in central Sweden and on 
the west coast, while some more peripheral areas are less affected by the change. 

There is hence a complex situation lying behind the maps in Figure 7.9 (Chapter 7, 
p. 141), which show that dialects in central Sweden and on the west coast are the 
ones that are changing the most, while several peripheral dialects seem more stable. 
Convergence to the standard language partly explains the change, but the diffusion 
of a new more open pronunciation of some of the front vowels among young speak- 
ers, which is not a change towards what traditionally has been considered Standard 
Swedish, also explains a large part of the change. 

The lowering of /ɛ:/ and /ø:/ has been noted by scholars before. The local 
vernacular of Eskilstuna in particular has been the subject of many studies, and the lowering 
of /ɛ:/ and /ø:/ in Eskilstuna has been described by Nordberg (1975), Hammermo 
(1989) and Aniansson (1996). Kotsinas (1994) has described the use of the more 
open variants of these vowels among teenagers in Stockholm, and Andersson (1994) 
noted the spread of an open /ø:/ in Göteborg. Grönberg (2004) considered the open 
variant of /ø:/ a marker of general Swedish youth language and particularly of 
the vernaculars of the cities. In Grönberg’s study of teenagers in Västergötland the 
frequency of open /ø:/ was generally low, but the frequency grew higher the closer 
the subjects lived to Göteborg, which indicated diffusion from the city. 


8.2.2.1 Diachronic view 


In § 6.1.11, Nordberg’s (1975) explanation for the emergence of a more open 
pronunciation of /ø:/ in Eskilstuna was described. Initially it was a “socio-linguistic 
hypercorrection” that occurred among lower-class speakers when the speakers, who 
in their own dialect only had a close variant of /ø:/, wanted to imitate the more 
open pronunciation [œ:] used before /r/ in Standard Swedish. The hypercorrection 
occurred when the dialect speakers did not restrict themselves to using the open 
variant only before /r/ but in all contexts. In a second stage of the change, the open 
pronunciation in other than pre-/r/ context diffused to speakers of all social classes 
in a change from below (Nordberg, 1975), and some decades later entered Stockholm 
youth language (Kotsinas, 1994). 


Table 8.1. Older Swedish vowel system with ten long vowels. 

          front               central   back 
          −round    +round 
close     i         y         ʉ         u 
mid       e         ø                   o 
open      ɛ         œ                   a 


Table 8.2. The Swedish long vowel system after /œ:/ had merged with /ø:/. 

          front               central   back 
          −round    +round 
close     i         y         ʉ         u 
mid       e         ø                   o 
open      ɛ                             a 

Nordberg (1975) goes one step further back in the language history in order to 
explain the development of the more open pronunciation of /ø:/. Until the mid 
18th century Swedish had ten long vowel phonemes, which formed the symmetrical 
phonological system in Table 8.1. There was a separate phoneme /œ:/ which in the 
middle of the 18th century merged with /ø:/. The disappearance of the phoneme 
/œ:/ left a hole in the phonological system, as can be seen in Table 8.2. 

The hole left in the Swedish long vowel system by the disappearance of /œ:/ 
was filled by the emergence of a more open allophone of /ø:/, which had until 
then been one uniform sound without any allophones in complementary distribution 
(Nordberg, 1975). In the latter half of the 18th century /ø:/ was lowered before /r/ 
in the spoken language of the higher social classes in Central Sweden. A similar 
allophonic variation is found for /ɛ:/ (/ɛ:/ → [æ:] / __ /r/). To fit the pre-/r/ 
variants of both /ø:/ and /ɛ:/ into the vowel system a fourth degree of vowel height 
is needed for front vowels. 

The Standard Swedish vowel system with nine long vowels and pre-/r/ variants 
of /ø:/ and /ɛ:/ has been difficult to describe phonologically, as already explained 
in § 2.3.1. One reason for this is that sometime during the 19th century /ʉ:/ lost 
its central position in Table 8.1 and became a phonetically front vowel, which made 
the close front part of the vowel system very crowded. Articulatorily and acoustically 
/ʉ:/ was distinguished from /y:/ and /ø:/ by another type of rounding; /ʉ:/ has 
been described as in-rounded while /y:/ and /ø:/ are out-rounded. Constructing 
a simple symmetric phonological system with distinctive features was impossible 
taking phonetic facts and phonological variations into account. Table 8.3 shows two 
examples of distinctive features for Swedish long vowels that have been proposed. 

Table 8.3. Two examples of the numerous different structural descriptions of the 
Swedish long vowels that have been proposed (allophonic variants are not included). 

Malmberg (1956): 

          front                         back 
          round₀    round₁    round₂ 
close     i         y                   u 
mid       e         ø         ʉ         o 
open      ɛ                             a 

Traunmüller & Öhrström (2007): 

          front                         back 
          round₀    round₁    round₂ 
close     i         y         ʉ         u 
mid       e         ø                   o 
open      ɛ                             a 


In many Swedish dialects the phoneme /œ:/ was preserved much longer than 
in the standard language. In these dialects there was no room for a lowering of 
/ø:/ before /r/, because /œ:/ occupied the place in question in the vowel space. 
Therefore the pronunciation of /ø:/ remained more close than in Standard Swedish 
in all phonetic contexts in many dialects. 


Table 8.4. Long vowel system in Eskilstuna (Nordberg, 1975). 

          front               back 
          −round    +round 
close     i         y         u 
mid       e         ʉ         o 
open      ɛ         ø         a 

A diachronic study by Nordberg (1975) shows that the lowering of /ø:/ in 
Eskilstuna started after /œ:/ had disappeared as a separate phoneme, which happened 
later in dialects surrounding Eskilstuna than in the standard language spoken in 
Stockholm. The lowering started in pre-/r/ context, and after that the more open 
pronunciation started to occur in all positions. Due to “socio-linguistic 
hypercorrection” the pronunciation of /ø:/ in Eskilstuna developed a step further than in 
Standard Swedish and the pronunciation became [œ:] in all positions instead of 
maintaining two allophones in complementary distribution. In Eskilstuna after /ø:/ 
had been lowered, /ʉ:/ was lowered, too, in a classical drag chain. In addition /ʉ:/ 
became less labialized and started to sound more like the original /ø:/. Through 
this drag chain the phoneme system, which had become asymmetrical by the loss of 
/œ:/, became symmetrical again, as can be seen in Table 8.4. 

The role of /ɛ:/ in this chain shift has been less well described. Standard Swedish 
has two allophones of /ɛ:/, but in contrast to /ø:/ there has not been any loss of 
an unrounded open front vowel which would have left room for allophonic variation 
of /ɛ:/ in the phoneme system. An approximate date for the emergence of the two 
allophones of /ɛ:/ is hard to find in the literature. 

Nordberg (1975) mentions that in Eskilstuna the variation across age groups and 
social groups in /ɛ:/ seemed to be similar to that found for /ø:/. That is, the 
highest social group followed the Standard Swedish norm by using a close variant 
and a more open pre-/r/ variant. In the lower social group younger speakers used 
[æ:] in all contexts, while older speakers used [e:] in all contexts. The co-variation 
of /ɛ:/ and /ø:/ in Eskilstuna was shown quantitatively in a factor analysis by 
Hammermo (1989). 


[Two map panels: old, young] 

Figure 8.1. Euclidean distance between the vowels in nät and lär measured with 
two PCs at 25% of the vowel duration. Green: distance = 0. 


8.2.2.2 Restructuring of the phoneme system 


A word which would include the older phoneme /œ:/ (which might still be preserved 
in some dialects) is unfortunately not included in the data set of this thesis. 
Therefore it is not possible to use the present data set for evaluating to what extent the 
emergence of a more open pronunciation of /ø:/ is a consequence of the 
disappearance of /œ:/. Historical data provides some background. For example, the works 
of Götlind & Landtmanson (1940-50, Vol. 1) and Landtmanson (1952) show that 
a phoneme /œ:/ has been present in the dialects in Västergötland, where a close 
pronunciation of /ø:/ was found in this thesis. Grönberg (2004, 114-115) mentions 
that the diffusion of the open pronunciation of /ø:/ might have been slowed down 
in Västergötland by the fact that the phoneme /œ:/ has been considered a negative 
dialect marker that people have wanted to avoid. 

In the present data set it is interesting to explore which varieties show allophonic 
variation in /ɛ:/ and /ø:/ with more open variants occurring before /r/. Figure 8.1 
shows the acoustic distance between the vowels in nät and lär, and Figure 8.2 the 
distance between the vowels in söt and dör for older and younger speakers at each 
site as measured near the onset of the vowels.³ Green means that the distance is 
0, while magenta indicates a large distance. The six older and six younger speakers 
that are considered speakers of Standard Swedish are included in the maps as rotated 
squares in the upper left corner. 


[Two map panels: old, young] 

Figure 8.2. Euclidean distance between the vowels in söt and dör measured with 
two PCs at 25% of the vowel duration. Green: distance = 0. 


The older speakers of Standard Swedish have a moderate distance between the 
pre-/r/ variants and the neutral variants. The difference is somewhat larger for /ø:/ 
than for /ɛ:/. These six older standard speakers can be considered to represent what 
has been regarded as standard pronunciation of these vowels. 

It is clear that most dialects have a smaller distance between /ɛ:/ and /ø:/ and 
their respective pre-/r/ variants than the older speakers of Standard Swedish have, 
which suggests a vowel system without allophonic variants. 

The largest distances are found among older speakers in Uppland, Gotland and 
Finland (except for Åland and Houtskär). In Uppland there is a drastic change 
between older and younger speakers. For both /ɛ:/ and /ø:/ the distance between 
the variants of the vowels in the two allophonic contexts is much smaller for younger 
speakers than for older speakers. On Gotland and in Finland the distance between 
the two allophones of /ø:/ seems to decrease more than for /ɛ:/. 


³ The distance is measured as the Euclidean distance of PC1 and PC2 as measured at 25% of the 
total vowel duration, which makes the distance equal to the difference in color that can be observed 
when comparing the maps of vowel quality close to onset of the two vowels in Appendix C. The 
picture of the distance between the vowels in söt and dör is complicated somewhat by the fact that 
dör has preserved a Proto-Nordic diphthong in some varieties. In these varieties, found mainly in 
Norrland and Finland, the measure is not a direct measure of /ø:/ in different contexts. 
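
The distance measure behind Figures 8.1 and 8.2 can be illustrated with a small sketch. It assumes that each vowel at a site is summarized by its PC1 and PC2 scores at 25% of the vowel duration. The function name and the numbers are invented for illustration and are not taken from the data set.

```python
import math

def pc_distance(vowel_a, vowel_b):
    """Euclidean distance between two vowels given as (PC1, PC2) pairs."""
    return math.hypot(vowel_a[0] - vowel_b[0], vowel_a[1] - vowel_b[1])

# Hypothetical PC scores for one speaker group at one site,
# standing in for the vowels in nät and lär.
nat_vowel = (1.0, 2.0)
lar_vowel = (4.0, 6.0)

# A value of 0 would mean no acoustic difference between the neutral
# variant and the pre-/r/ variant (shown in green on the maps).
print(pc_distance(nat_vowel, lar_vowel))
```

Computed for older and younger speakers separately, such a value would show whether the allophonic contrast at a site is growing or shrinking between generations.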

In an area south-west of Stockholm comprising the provinces Södermanland, 
Närke and Östergötland, both older and younger speakers have a small distance 
between the pre-/r/ variants and the neutral variants. But a comparison with the 
maps of each vowel in Appendix C still shows a big difference between older and 
younger speakers. Younger speakers have an open pronunciation in both contexts 
while older speakers have a close pronunciation in both contexts. The same seems 
to be the case for sites on the west coast, close to Göteborg. In these areas there 
seems to be little allophony, but there is a shift from more close front vowels to more 
open ones. 

In a western area around lake Vänern and close to the Norwegian border, 
relatively close pronunciations of /ɛ:/ and /ø:/ seem to be common in both phonological 
contexts, and the shift between older and younger speakers is not that big. 

In Småland there is an area where the older speakers have a close pronunciation of 
the vowels, but where the younger speakers have shifted towards a more standard-like 
system with a larger distance between the pre-/r/ variants and the other variants. 

The distance between the two allophones of both phonemes is smaller for the 
younger Standard Swedish speakers than for the older. Nordberg (1975, 602) 
predicted that the situation with two allophones of /ø:/ in Standard Swedish was only 
a temporary stage, which was the result of the disappearance of the phoneme /œ:/. 
This stage had, according to Nordberg (1975), been maintained for quite a long 
period because it happened to represent a prestigious standard norm. In Eskilstuna, 
on the contrary, the development from a close vowel /ø:/ to a more open 
pronunciation in all phonetic contexts happened relatively fast. The present data set shows 
that the difference between pre-/r/ variants and /ø:/ and /ɛ:/ in other contexts 
seems to be decreasing in Standard Swedish. 

In many dialects the system with allophones in complementary distribution of 
these two vowels has not existed. In some of these areas all speakers of the present 
study still use close variants of these vowels; in others the younger speakers have 
adopted the more open pronunciation in all contexts. The close pronunciation seems 
to be persistent especially in the provinces Västergötland and Dalsland. Grönberg 
(2004, 344) has proposed that the close pronunciation might continue to be a part of 
a West Swedish regional standard also in the future, since this feature seems to be 
preserved in the dialects while other dialectal features are disappearing rapidly. In 
Småland there is an area where younger speakers make a larger distinction between 
the allophones than the older speakers do. Hence, the younger speakers in 
Småland have oriented themselves towards what, at least, used to be seen as Standard 
Swedish. Unless other linguistic or extra-linguistic factors change the development, 
one could predict that the next generation of speakers in Småland will use the open 
variants in all contexts. 

In order to see the effect of the lowering of front vowels on the whole long vowel 
system, Figure 8.3 repeats Figure 6.1 (p. 87) but with older and younger speakers 
separated. One should keep in mind that the ellipses include vowels from many 
different varieties of Swedish and that these varieties differ from each other both 
sub-phonemically and phonemically. Still, the general trend can be seen very clearly: 
apart from the leveling of dialects (smaller ellipses with less overlap for younger 
speakers) there is a general lowering of front vowels going on. Especially the vowels 
in dör, lär, lös, nät and söt are being lowered and are thereby filling a place in 
the vowel space that was not previously filled by any Standard Swedish long vowel 
phoneme. 


[Vowel plot panel: Long vowels, younger speakers] 

Figure 8.3. The 19 vowels of older and younger speakers in the PC2/PC1 plane. 
The one standard deviation ellipses are drawn based on the average PC values of the 
two speaker groups at each site measured at the temporal midpoint of each vowel. 

Not all Standard Swedish short vowels are represented in the data set, which 
makes the picture of short vowels incomplete. Of the short vowels included in the 
data set the vowel in lett shows the most lowering in Figure 8.3. 

Auer (1998) discusses the relationship between endogenous (“natural”) and con- 
tact-induced changes in dialect-standard language settings. In addition to direct 
convergence to standard, dialects sometimes converge to the standard language or 
to each other as a consequence of internal restructuring and innovation. The in- 
novations can at the same time be triggered by dialect contact. Regional dialect 
leveling can share features with koineization, especially with respect to simplifica- 
tion (Kerswill, 2002, 672). Simplification can involve increase in regularity, decrease 
in markedness or the loss of categories (Kerswill, 2002, 671). A loss of allophonic 
variants of /ø:/ and /ɛ:/ means a simplification of the Standard Swedish vowel 
system in the sense that the vowel inventory becomes smaller. It can also be seen as 
a removal of marked forms, since the allophonic variants have been used only in 
a small part of the whole language area and can therefore be considered marked 
(Trudgill, 1986, 98). In the data set the dialects in Uppland show simplification by 
the loss of allophonic variants, and, in addition, there seems to be a demerger of /e:/ 
and /ɛ:/ going on in Uppland. The marked situation where the distinction between 
/e:/ and /ɛ:/ was maintained only before /r/ would hence be resolved. 

For the dialects which have not had any allophonic variants of /ø:/ and /ɛ:/ the 
introduction of the open pronunciation means in a sense a convergence to Standard 
Swedish, because that is the variety that the open variants have been associated 
with. But at the same time the dialects keep their original internal structure by not 
introducing allophonic variation but by lowering the vowels in all contexts. 

For the chain shift described by Nordberg (1975) to be complete /ʉ:/ should be 
lowered, too. Figure 8.3 shows a decrease in the dialectal variation in lus particularly 
on PC1, but no large-scale lowering in general. However, in the analysis of the vowel 
in lus in § 6.1.7 a lowering was noted especially in Svealand and western parts of 
Norrland. In other areas a more open pronunciation was found to be common in 
both generations of speakers. 

The chronology of the chain shift is debatable. Is /ʉ:/, which has been considered 
the most problematic vowel of the Swedish vowel system, pushing the chain, or is the 
chain being dragged by the hole left when /œ:/ merged with /ø:/? The mechanisms 
might be different in different parts of the language area depending on the phoneme 
systems of the local rural dialects. 

For Standard Swedish the vowel shift certainly means that a phoneme system 
which has been very difficult to describe structurally is being simplified, as the vowel 
inventory becomes both smaller and more symmetrical. This simplification can be 
interpreted as contact-induced. As Linell (1973, 12) has pointed out: the vowels 
that have presented a problem for the phonological description of Standard Swedish 
are the same that show considerable variation across Swedish dialects. That is most 
likely no coincidence. Nordberg (1975, 602) predicted that Standard Swedish had 
frozen in a temporary stage of a vowel shift. Due to dialect contact the development 
in Standard Swedish now seems to be proceeding into the next stage. This opening 
up for a change in Standard Swedish might be connected to the attitudinal change 
towards linguistic variation in public language that has been noted since the 1970s 
(see § 2.1.1). The symmetric structural description of the Swedish vowel system 
in Table 2.2 (p. 19), identical to the one Nordberg (1975) proposed for Eskilstuna 
(Table 8.4), which did not correspond to articulatory and acoustic facts in Standard 
Swedish a few decades ago, may correspond better to young people’s speech today. 

The dialect leveling process that the Swedish dialects are involved in shows com- 
plex mechanisms of convergence to Standard Swedish with simultaneous restructur- 
ing towards a more “natural” vowel system. In linguistic changes involving innov- 
ations it is not uncommon to find that innovations diffusing from the center reach 
peripheral parts much later (if at all). This is exactly the pattern that can be ob- 
served in the maps in Figure 7.9 (p. 141). Areas that do not show much aggregate 
change in vowel pronunciation are for example Skåne, Gotland and the Swedish 
dialect area in Finland. Edlund (2003, 28) has pointed out that Skåne and Gotland 
are Swedish regions with a strong regional identity. The identity is enhanced by an 
awareness of the historical developments that have formed these areas (Skåne was 
part of Denmark for a long time, while in medieval times Gotland was independent 
from Sweden and had an important position in the Hanseatic League), and by local 
traditions and cultural heritage. The province of Skåne has its own flag and there is 
a strong separatist movement. The Swedish language areas in Finland are separated 
from the rest of the Swedish dialects not only by the sea and a different political 
history, but also by a national border. Local identity is manifested through language 
use, and a strong identity serves to preserve characteristic features in the language. 
The regional varieties spoken in the three mentioned areas do not have a symbolic 
value only for the speakers themselves in these areas, but these varieties are also 
easily recognizable to other Swedes. 


8.3 Analysis of variables vs. aggregate analysis 


In the previous chapters dialectal variation in vowel pronunciation has been analyzed 
both on the variable level and by using an aggregating technique. The former ap- 
proach is the one that has traditionally been used by dialectologists, while the latter 
one is the one preferred in dialectometry. To what extent can these two approaches 
supplement each other, and to what extent are they redundant? I think that the 
discussion above about changes in Swedish vowel pronunciation has shown that both 
approaches are needed for an exhaustive view of dialectal variation. 

The aggregate analysis by means of multidimensional scaling (MDS) showed 
that even if the distribution of the separate linguistic variables is gradual, taking all 
available information into account some more coherent dialect areas can be detected. 
Techniques like MDS allow for great quantities of data to be taken into account (even 
if the number of linguistic variables in this particular study was not very large, 
restricted to 19 vowels). The analysis of large amounts of data with computational 
techniques makes it possible to detect relationships which no dialectologist could 
identify with manual methods. 
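
The input to an aggregating technique such as MDS is a site-by-site distance matrix. The sketch below shows one way per-vowel measurements could be combined into such a matrix. It assumes, purely for illustration, that the aggregate distance between two sites is the mean Euclidean distance over the vowels they share; the measure actually used in this thesis is the one defined in the earlier chapters, and the site names and values here are invented.

```python
import math
from itertools import combinations

def site_distance(site_a, site_b):
    """Mean Euclidean distance over the vowels two sites have in common.

    Each site maps a word to a (PC1, PC2) pair. Averaging over vowels is an
    assumption of this sketch, not necessarily the measure used in the thesis.
    """
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("sites share no vowels")
    total = sum(math.dist(site_a[word], site_b[word]) for word in shared)
    return total / len(shared)

def distance_matrix(sites):
    """Symmetric site-by-site matrix, stored as a dict of dicts."""
    matrix = {name: {name: 0.0} for name in sites}
    for a, b in combinations(sites, 2):
        d = site_distance(sites[a], sites[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix

# Hypothetical values for three invented sites and two words.
sites = {
    "A": {"lus": (0.0, 0.0), "lat": (1.0, 1.0)},
    "B": {"lus": (3.0, 4.0), "lat": (1.0, 1.0)},
    "C": {"lus": (0.0, 0.0), "lat": (1.0, 3.0)},
}
matrix = distance_matrix(sites)
```

A matrix of this kind, computed over all sites, is what an MDS routine would then reduce to a small number of dimensions. Building it separately for older and younger speakers would also allow the inter-site distances of the two generations to be compared.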

Data-driven analysis of large amounts of data reduces the influence of subjective 
choices of the researcher. Nonetheless, the data given as input to the analysis will of 
course determine the outcome, and data for quantitative analyses should be chosen 
carefully. A very frequent feature in the data will naturally explain most of the 
variance and hence come out as the most important factor. In the present study 
this was shown by the fact that the first dimension of the MDS as well as the 
first factor of the factor analysis (FA) were largely determined by a spectral feature 
which influenced all front vowels in the data set in a similar way. This spectral 
feature was not connected to the articulation of specific vowels, but was assumed 
to be related to voice quality differences which would influence the speech signal as 
a whole. Because the feature is present in many vowels it will also explain most of 
the variance in the data. Whether an omnipresent spectral feature is also a salient 
feature in perception of dialectal differences, and if it should be considered the most 
important factor when, for example, making a dialect division, can be discussed. 

Andersson (2007, 40-41) discusses the fact that the linguist’s view of a dialect 
is usually a set of linguistic details, while laymen generally have a holistic view of 
how a specific dialect sounds without necessarily any idea of how specific vowels or 
consonants are being pronounced. One could say that the aggregate dialectometric 
analysis models the layman’s view. The relationships between varieties are studied 
by analyzing all the available linguistic features as a whole, and the view of the 
relationships between varieties in a dialectometric analysis does not include any 
detailed description of linguistic features. Dialectologists are generally still also 
interested in describing dialect areas linguistically and finding causal relationships 
for the observed distribution patterns. This information is not directly available in 
the results of an aggregate analysis like MDS. 

The analysis on the variable level in Chapter 6 of the present thesis has shown 
the variation in separate variables as well as the presence of co-occurring features. 
Visualization of the geographic distribution of specific variables (§ 6.1) can be com- 
pared to the isoglosses of traditional dialectology, while the analysis of co-occurring 
features by means of FA (§ 6.3) can be compared to isogloss bundles. But contrary
to drawing isoglosses, FA is completely data-driven with automatic recognition of
co-occurring features. The use of a numeric measure of vowel quality made it pos- 
sible to visualize not only abrupt linguistic borders, but also continua and gradual 
borders in the data as well as non-continuous areas. 

Because it reduces the data to fewer dimensions than FA, MDS cannot account
for all the different underlying linguistic distribution patterns. On the
other hand, sites with some amount of missing data could be included in the MDS
but not in the FA. The aggregate relationships between dialects can be studied even
if a few variables are unknown for some varieties. The differences and similarities
between FA and MDS were shown by correlating the results of the two methods
in § 7.2.3.

The second factor of the FA showed the co-occurrence of a number of variables 
which characterize the South Swedish varieties—particularly diphthongization of 
long close vowels. This factor explained 9.1% of the total variance in the data. In
the MDS the effect of these variables was distributed over all three dimensions of
the analysis, which means that no direct conclusions about the specific variables
distinguishing South Swedish varieties could be drawn from the MDS results alone.

The third dimension of the MDS could be interpreted as a peripherality dimen- 
sion. This dimension correlated significantly with a number of factors from the FA 
which showed different distribution patterns, all being mostly related to some more 
peripheral dialects. Variables related to these factors were, for example, different 
kinds of diphthongization. In MDS, these different distribution patterns were not
separated; instead, the effects of these variables, which explain a relatively small
amount of the total variance in the data, were merged into one dimension which
distinguished peripheral, more divergent dialects. Varieties grouped together by the
third dimension of the MDS hence share the fact that they all have a large linguistic distance
to more central varieties, rather than that they would be linguistically very similar 
to each other. 

The sixth factor of the FA did not correlate significantly with any of the dimen- 
sions of the MDS. The sixth factor identified a few variables which show similar 
geographic and generational distribution in the data but for which the differences 
across varieties are small. These differences were too small to weigh heavily in
an aggregate analysis. The reason that these variables turned up as a more significant
factor in FA is probably that the FA was built on a correlation matrix, which
makes all original variables count equally despite different ranges of the values on
the original variables. If a variance-covariance matrix had been used instead
of a correlation matrix in the FA, these variables would probably have counted less.
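
The difference between the two kinds of input matrix can be illustrated with a small sketch in plain Python. The numbers are invented, and the helper functions `covariance` and `correlation` are written out here only for illustration; nothing below reproduces the actual FA of this thesis.

```python
from statistics import mean

def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson correlation: covariance rescaled by both standard deviations."""
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)

# Two made-up variables with the same pattern across five sites,
# but the second has a range that is 100 times smaller.
wide = [1.0, 2.0, 4.0, 7.0, 9.0]
narrow = [v / 100 for v in wide]

# Variances (the diagonal of a variance-covariance matrix) differ by a
# factor of 100**2, so the narrow variable would contribute very little.
var_wide = covariance(wide, wide)
var_narrow = covariance(narrow, narrow)
print(var_wide / var_narrow)

# In a correlation matrix every variable has unit variance on the diagonal,
# so both variables count equally regardless of their original range.
print(correlation(wide, wide), correlation(narrow, narrow))
print(correlation(wide, narrow))
```

A variable with small absolute differences across varieties can therefore carry as much weight in a correlation-based analysis as one with large differences, which fits the behaviour of the sixth factor described above.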

A kind of opposite effect was found, too, where an object which was not detected 
to deviate strongly from other varieties in the analysis of the separate variables 
was pointed out as an outlier by the MDS. The younger speakers in Löderup were
assigned extremely high values in the second dimension of the MDS. Even if this 
group of speakers did not deviate heavily from surrounding varieties concerning the 
separate variables, there must be an accumulated effect of several small differences 
which makes this object an outlier in the MDS. 

Even if multivariate statistical analyses help the researcher to find regularities
in complex data, the discussion above in § 8.2.2 has shown that in order to explain
ongoing linguistic changes, relationships that are not revealed by either MDS or FA
have to be taken into account. Language change can be the result of very complex
interactions between intra-linguistic and extra-linguistic factors. Analyses that take
a number of sociolinguistic variables as well as language historical developments into
account can help to show what the mechanisms behind a language change are.

The analysis of gender- and age-related variation in § 7.2.4 showed that there is 
a larger difference in vowel pronunciation at the aggregate level between women and 
men in the older generation than in the younger generation. The difference showed 
up in the second dimension of the MDS, which is also the main separator of older 
and younger speakers. Since the older men were assigned values more similar to those
of the younger speakers than the older women were, a superficial conclusion
could be that the older women are more conservative than the older men. To
draw any such conclusion, one would, however, need to know more
about which variables explain the difference between older women
and older men, and to analyze these variables in a sociolinguistic context.


Chapter 9 
Summary and conclusions 


The aim of this thesis was to study dialectal variation in Swedish vowel pronunci- 
ation. The Swedish dialects have undergone massive leveling during the 20th century. 
Many features of the traditional rural dialects have been lost, and local dialects have 
been replaced by regional dialects and regional varieties of Standard Swedish. Vowel 
pronunciation is considered one of the linguistic levels where considerable regional 
variation across the Swedish language area is still found. 

The data for this study come from the SweDia dialect database. The database
includes data recorded at more than one hundred rural sites in Sweden and the
Swedish-speaking parts of Finland around the year 2000. For this thesis data from
98 sites were analyzed. At each site approximately twelve speakers were recorded:
three older women, three older men, three younger women and three younger men.
The older speakers were approximately 55-75 years old and the younger speakers
between 20 and 35.

The vowel data comprises isolated words where the vowels are in a C__C context. 
Most of the words are monosyllabic, and in the disyllabic words the vowel analyzed is 
the one in the stressed position, which should ensure maximal differentiation between
vowel classes. Words with coronal consonants were chosen for eliciting the vowels to 
minimize different co-articulation effects in different vowels. 

Nineteen vowels were analyzed. This set of vowels includes all of the Standard
Swedish long vowel phonemes. Of the Standard Swedish short vowels four (/ʊ/,
/e/, /ø/ and /ɵ/) are missing, because they had not been consistently elicited for
the SweDia database. In addition to the Standard Swedish phonemes, allophonic
variants and a few vowels reflecting Proto-Nordic diphthongs were included. Vowels
vary not only sub-phonemically, but also phonemically across Swedish dialects. Since
no automatic way of doing a phonemic analysis of a large number of varieties exists,
this thesis was restricted to analyzing phonetic variation only. For this reason, the
vowels might represent different phoneme categories in the varieties that are being
compared. Phonemic variation can be expected especially in areas where dialect
leveling has not been very strong and traditional rural dialects have been preserved.
Because phonemic variation has not been taken into account in this thesis, the vowels
are referred to by the word that was used for elicitation instead of by
phoneme category; for example, the lat vowel instead of /ɑ:/.

The vowels were analyzed acoustically by means of principal component analysis 
of Bark-filtered vowel spectra. The two extracted principal components (PCs) can 
be interpreted roughly in terms of vowel height (PC1) and advancement (PC2). A
comparison with formant measurements for a subset of the data showed high
correlations. The correlation between PC2 and F2 is somewhat weaker than that between
PC1 and F1.
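
The reduction of a band-filtered spectrum to two components can be sketched as follows. This is a generic PCA via singular value decomposition on synthetic data, assuming 18 filter bands per vowel token; the function name `pca_scores`, the number of bands, and the random data are illustrative assumptions, and the sketch does not reproduce the filtering or PCA settings used for the SweDia data.

```python
import numpy as np

def pca_scores(data, n_components=2):
    """Return PC scores and explained-variance ratios via SVD."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by singular value.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2) / (s ** 2).sum()
    return scores, explained[:n_components]

# Synthetic stand-in for band-filtered spectra: 200 vowel tokens with
# 18 band values each (random numbers, not real measurements).
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))      # two hidden "vowel quality" factors
loadings = rng.normal(size=(2, 18))     # how each factor shapes the spectrum
spectra = latent @ loadings + 0.1 * rng.normal(size=(200, 18))

scores, explained = pca_scores(spectra, n_components=2)
print(scores.shape, explained.sum())
```

Each token ends up with two scores, playing the role that PC1 and PC2 play in the text; whether such components line up with height and advancement depends on the real data and cannot be seen from random input.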

Diphthongization was taken into account by either letting one sample near the 
onset (at 25% of the duration) and one sample near the offset (at 75%) represent 
each vowel pronunciation (§§ 6.1 and 6.3), or by using nine continuous sampling 
points of every segment (§ 6.2 and Chapter 7). Only vowel quality was analyzed, 
not vowel length. Differences in how phonological quantity of vowels and consonants
is realized in Swedish dialects have been studied previously. Schaeffler (2005) made a
typology of Swedish dialects based on quantity using data from the same database 
as the one employed in the present study. 
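
The two ways of sampling a vowel segment can be sketched in a few lines of Python. The segment boundaries are invented, the function name `sample_times` is hypothetical, and the equal spacing of the nine points is an assumption, since the exact placement is not stated in this chapter.

```python
def sample_times(onset, offset, fractions):
    """Return time points at the given fractions of a segment's duration."""
    duration = offset - onset
    return [onset + f * duration for f in fractions]

# Hypothetical vowel segment from 0.120 s to 0.320 s (duration 0.200 s).
onset, offset = 0.120, 0.320

# Two-point representation: near the onset (25%) and near the offset (75%).
two_points = sample_times(onset, offset, [0.25, 0.75])

# Nine-point representation; equal spacing at 10%..90% is an assumption.
nine_points = sample_times(onset, offset, [i / 10 for i in range(1, 10)])

print(two_points)
print(len(nine_points))
```

With two points a monophthong yields two nearly identical measurements and a diphthong two clearly different ones; nine points describe the same trajectory in finer steps.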

Dialectal variation was studied both in each vowel and on an aggregate level. 
Both methods contributed to the understanding of the dialectal variation and were 
shown to complement each other. 

The analysis on the variable level showed that the two most variable vowels across
sites in the data set were the vowels in dör and sot, while the vowels that showed
the smallest degree of geographic variation were the vowels in särk, lass and disk. Long
vowels showed considerably more variation than short vowels. Most of the vowels
analyzed were more variable across sites in the older generation of speakers than in
the younger generation. Only the vowel in söt showed considerably more variation
among younger speakers than among older speakers.

A factor analysis showed the co-occurrence of a number of vowel features. The 
first factor showed that varieties in Svealand and southern parts of Finland share 
some spectral feature which is connected to PC2 of front vowels. This feature might 
be connected to voice quality differences rather than to vowel articulation. The 
second factor identified diphthongization of long vowels in South Swedish varieties,
which most strongly affects close vowels. The third factor showed that a
number of front vowels are pronounced more open by younger speakers than
by older speakers.

An aggregate analysis of the vowel data showed that the Swedish dialects form 
a linguistic continuum when it comes to vowel pronunciation and no abrupt dialect 
borders can be found, which is in line with previous literature. Within the continu- 
ous distribution of vowel features, however, some more coherent dialect areas could 
be identified. These areas coincide to a large extent with classifications that have 
previously been proposed for Swedish dialects (Wessén, 1969) and for regional vari- 
eties of Standard Swedish (Elert, 1994). The areas are also similar to those proposed 
in an intonational typology of Swedish dialects by Bruce (2004), but not so similar 
to those proposed in the typology of phonological quantity in Swedish dialects by 
Schaeffler (2005). 

The analyses indicated a large-scale leveling of Swedish dialects. The average
linguistic distances between sites based on vowel pronunciation were significantly 
shorter for younger speakers than for older speakers. This effect was seen clearly on 
maps visualizing the results of multidimensional scaling. But even if the linguistic 
distances between dialects are becoming smaller, the same larger geographic regions 
could be identified in the variation in vowel pronunciation of both older and younger 
speakers. 
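
How an average linguistic distance between sites can be compared across age groups is sketched below. The site labels and values are invented, the function name `mean_pairwise_distance` is hypothetical, and the sketch only computes the averages; it does not include any significance test.

```python
from itertools import combinations
from math import dist

def mean_pairwise_distance(sites):
    """Average Euclidean distance over all pairs of sites."""
    pairs = list(combinations(sites.values(), 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

# Made-up site means on two acoustic dimensions, per age group.
older = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0)}
younger = {"A": (0.0, 0.0), "B": (1.5, 2.0), "C": (3.0, 4.0)}

older_mean = mean_pairwise_distance(older)
younger_mean = mean_pairwise_distance(younger)
print(older_mean, younger_mean)
```

In this toy example the younger group keeps the same ordering of sites but with halved distances, which is the pattern described above: smaller distances, yet the same larger regions.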

A large linguistic distance between older and younger speakers was found in the
province of Norrbotten, where the results of the analysis on the variable level showed
that the change can be interpreted as a strong tendency to converge towards Standard
Swedish. While older speakers in Norrbotten make use of several diphthongs, both
ones that originate from Proto-Nordic and ones that are a result of later develop- 
ments, most younger speakers use only monophthongs, which corresponds to the 
Standard Swedish vowel system. 

In other peripheral areas that are also characterized by a large number of diphthongs,
the tendency to converge towards Standard Swedish is not as strong. This is
the case for the island of Gotland, the south of Sweden and some varieties in Finland.

In central Sweden, around and south of Stockholm and close to Göteborg, the
aggregate distance in vowel pronunciation between older and younger speakers is
large. The ongoing change in vowel pronunciation in central Sweden could be connected
to an ongoing chain shift in front mid-vowels (described in detail in § 8.2.2).
In Standard Swedish the vowels /ɛ:/ and /ø:/ have been characterized by allophonic
variation with more open variants ([æ:], [œ:]) being used before /r/. In the dialects,
this allophony has been found mainly in the east, while western dialects have not had
any allophony of these two vowels; instead, the pronunciation has been close in all
contexts. The new pronunciation that many younger speakers in central Sweden have
does not correspond to either of the two systems mentioned; instead, an open
pronunciation of the two vowels, corresponding to the one found only before /r/ in
Standard Swedish, is used in all contexts. This diffusion of more open variants of
/ɛ:/ and /ø:/ in central Sweden has been noted by scholars previously. In this thesis
the lowering could be confirmed acoustically for a large central Swedish area. A
lowering, which leads to reduced phonetic distinction between the allophonic variants,
was also detected for younger speakers representing Standard Swedish in the data set.

An additional long vowel involved in the chain shift is the vowel in lus (Standard
Swedish /ʉ:/). The phonetic character of this vowel has complicated structural
descriptions of the Standard Swedish vowel system. The vowel in lus shows considerable
variation in the acoustic measures across the dialects involved in this study.
A lowering of /ʉ:/ in combination with the loss of allophonic variants of /ɛ:/ and
/ø:/ in Standard Swedish would lead to a smaller vowel inventory and a more symmetric
vowel system, and can hence be interpreted as simplification. Simplification
often occurs as a result of dialect contact in regional dialect leveling. When it comes
to the ongoing change in Swedish front vowels, many dialects show convergence to
Standard Swedish by adopting the open variants of /ɛ:/ and /ø:/, which have been
associated with Standard Swedish pronunciation. But at the same time the system
without allophonic variation, which has been characteristic of many dialects but
not of Standard Swedish, is maintained. These results suggest that the interaction
between dialects and Standard Swedish has led to a development where the Standard
Swedish vowel system becomes more simplified in the course of time.


Dutch summary (Nederlandse samenvatting)


There is a great deal of variation in the pronunciation of vowels, both in Swedish
dialects and in regional varieties of Standard Swedish. Although this has long been
known, linguists have carried out little acoustic research on dialects from the whole
Swedish language area. The aim of this thesis was to describe the geographic variation
in vowel pronunciation in Swedish. The research questions were as follows:


• What does the geographic distribution of dialectal features in vowel pronunciation
look like?

• Are there vowel features that show the same variation?

• How large is the dialectal variation, and to what extent are local dialects still
spoken?

• Which dialects are changing? Which dialects are stable?

• How large is the distance in vowel pronunciation between older and younger speakers?

• Which vowels are changing, and how?

• Are there differences in vowel pronunciation between men and women?

• How can the Swedish dialects be classified on the basis of vowel pronunciation?

• Does the classification of present-day varieties of Swedish correspond to the
traditional dialect areas?

• Does a classification of the dialects based on vowel pronunciation correspond to
classifications based on other linguistic levels?


This study used data from the SweDia¹ database. The material was collected in the
years 1998-2001 and contains information about 98 sites in the Swedish dialect area.
At each site, recordings were made of both male and female speakers in two age
categories. Almost all older speakers were between 55 and 75 years old when the
recordings were made. The younger speakers were between 20 and 35 years old at the
time of the recordings. At most sites twelve speakers were interviewed: three young
women, three young men, three older women and three older men. In total, the study
has information from 1170 dialect speakers. In addition, there are twelve speakers
who represent Standard Swedish pronunciation.

¹ <http://swedia.ling.gu.se/>

The analysis comprises nineteen vowels. The vowels were isolated from the following
Swedish words (Standard Swedish pronunciation of the vowels in brackets):
dis [i:] ‘mist, haze’, disk [ɪ] ‘counter, washing-up, disc’, dör [œ:] ‘die (present
tense)’, dörr [œ] ‘door’, flytta [ʏ] ‘move (house), relocate’, lass [a] ‘load, cargo’,
lat [ɑ:] ‘lazy’, leta [e:] ‘search’, lett [e] ‘lead (past participle)’, lott [ɔ] ‘lot’,
lus [ʉ:] ‘louse’, lås/låt [o:] ‘lock’/‘song’, lär [æ:] ‘learn, teach (present tense)’,
lös [ø:] ‘loose’, nät [ɛ:] ‘net’, sot [u:] ‘soot’, särk [æ] ‘nightshirt’, söt [ø:]
‘sweet, cute’, typ [y:] ‘type’.

These words contain all the long vowels and most of the short vowels of Standard
Swedish. Each speaker repeated the words three to five times. The same words were
used at all sites to elicit the vowels. Only for eliciting the vowel [o:] were two
different words used: lås at some sites and låt at others.

The pronunciation of the vowels was analyzed acoustically. Two different methods were
used to analyze the dialectal variation in the results of the acoustic measurements:
1) analysis per linguistic variable and 2) analysis of aggregate linguistic
distances.


Acoustic analysis


The most commonly used method for determining vowel quality acoustically in language
variation research is based on formant measurements. In this thesis, however, a
different method was chosen. The vowel spectra were filtered with Bark filters up to
18 Bark. The Bark scale is based on the critical bands of the basilar membrane in the
inner ear, so that a representation in Bark filters corresponds well to human
perception of speech sounds. This band-filter representation was then reduced to two
articulatorily meaningful components by means of principal component analysis (PCA).
This method was introduced by Plomp, Pols and Van de Geer in 1967. In her 2009 thesis,
Jacobi showed that this method is suitable for analyzing geographic and social
variation of vowels in large data collections. One reason for this is that PCA of
band-filter data can be automated more reliably than formant analysis. Automated
formant measurements always contain a number of erroneous values that have to be
corrected manually. In contrast to formant analysis, PCA of band-filter data can be
fully automated. Nevertheless, this study too was preceded by a large amount of
manual work. All vowel segments were manually transcribed and segmented in the SweDia
project and the follow-up project SweDat.



The chosen method, however, also has a number of problems. Because the whole vowel
spectrum is analyzed, the signal-to-noise ratio in the recordings affects the results
of the PCA. The dialect recordings were all made under relatively comparable
conditions (usually in a quiet room in the speaker’s home), so that there are no large
differences in signal-to-noise ratio between the recordings. The recordings of the
Standard Swedish speakers were made in a studio, which makes the signal-to-noise
ratio higher than in the dialect recordings. This had a significant influence on the
PCA scores.

A major problem for acoustic research on vowels is inter-speaker variation. This is
the result of anatomical and physiological differences in the speech organs of
individuals. Speakers with larger speech organs thus have lower formant
frequencies/PCA values than speakers with smaller speech organs. Listeners can adapt
immediately to a new speaker and automatically normalize the differences in the
acoustic signal between speakers. A number of methods for speaker normalization in
acoustic measurements have been developed, but most of these methods require that the
language varieties under study have comparable vowel systems, or at least comparable
corner vowels. This is not the case for Swedish dialects, and therefore there was no
suitable method for speaker normalization that could be applied directly. One
question in this thesis was therefore to what extent the speaker-dependent variation
in the acoustic measurements could be reduced.

Because men generally have larger speech organs than women, there is a considerable
difference in the size of the vowel space between men and women. For this study it
was very important to normalize the differences in the size of the vowel spaces of
men and women, because the number of men and women was not the same at all recording
sites. Average acoustic measurements per site were used to compare dialects with each
other. Without speaker normalization these values are affected by the number of men
and women at each site (at a site with more women than men, the average values of all
speakers would generally be higher than at a site with more men than women).
Furthermore, one of the aims of the study was to measure the linguistic differences
in vowel pronunciation between men and women. This is not possible if the
measurements are affected by anatomical and physiological differences. These problems
were solved by applying the PCA separately to data from men and women. Separate PCAs
for men and women significantly reduced the differences in the acoustic measurements
between the genders. The total speaker-dependent variation was also partly reduced by
the separate analyses of the vowels of men and women.
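
The effect of an unequal number of women and men on a site average can be shown with a small arithmetic sketch in Python. All numbers are invented, and the normalization used here (subtracting a per-gender offset) is a deliberately simpler stand-in for the separate PCAs described in the text, chosen only to make the arithmetic visible.

```python
from statistics import mean

# Made-up scores: women are assumed to score higher for anatomical reasons
# only; both sites are assumed to speak exactly the same dialect.
WOMAN, MAN = 1.0, -1.0

site_a = {"women": [WOMAN] * 8, "men": [MAN] * 4}   # more women than men
site_b = {"women": [WOMAN] * 4, "men": [MAN] * 8}   # more men than women

def raw_site_mean(site):
    """Mean over all speakers, ignoring gender."""
    return mean(site["women"] + site["men"])

def normalized_site_mean(site, offsets):
    """Mean after removing a per-gender offset (a simple stand-in for
    analysing women and men separately)."""
    values = [v - offsets["women"] for v in site["women"]]
    values += [v - offsets["men"] for v in site["men"]]
    return mean(values)

offsets = {"women": WOMAN, "men": MAN}
print(raw_site_mean(site_a), raw_site_mean(site_b))
print(normalized_site_mean(site_a, offsets), normalized_site_mean(site_b, offsets))
```

Without normalization the two sites appear to differ although only the gender balance differs; after the per-gender adjustment the spurious difference disappears.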

For a small part of the material (three sites) formants were measured. The
correlation between formants and the principal components shows that there is a
strong relationship between the first formant (F1) and the first principal component
(PC1) (r = 0.88). The relationship between the second formant (F2) and the second
principal component (PC2) is somewhat weaker (r = 0.73-0.74). This means that PC2,
more than PC1, is influenced by information in the spectrum other than the formants.

A multivariate analysis of variance (MANOVA) shows that F1 and F2 in Bark are
slightly better at distinguishing different vowels than PC1 and PC2, while the
unwanted effect of gender is much higher for the formant measurements than for the
principal components.

PC1 relates to vowel height and PC2 distinguishes front vowels from back vowels.
These relationships can be seen clearly, for example, in Figure 5.8 on page 66.

In the analyses of dialectal and social variation in vowel pronunciation, averages of
the PCs of speaker groups were used. In the analyses the speakers were grouped in
three different ways:

• one group per site (on average twelve speakers in each group)

• two groups per site: young and older speakers (on average six speakers in each
group)

• four groups per site: young women, young men, older women and older men (on average
three speakers in each group)


Analysis per linguistic variable


A number of analyses of the different vowels are presented in Chapter 6. In this
chapter all analyses are based on a division of the speakers into two groups per
site, i.e. young and older speakers. Each group thus contains on average three women
and three men. The maps in Appendix C belong to this chapter. The maps show the
average PC scores of the two speaker groups per site for each vowel.

A comparison of the variation per vowel showed that long vowels vary more than short
vowels, both between sites and between the two age groups. The vowels in the words
dör and sot vary the most between sites, while the vowels in disk, lass and särk show
the least variation.

Almost all vowels vary more between sites in the older age group than in the younger
age group. The decrease in dialectal variation is largest for the vowels in lär and
lat. Only the pronunciation of the vowel in söt varies more in the younger age group
than in the older age group.

The covariation between the variables was measured by means of a factor analysis. Of
the ten extracted factors, some identified clear dialect groups, while other factors
showed continuous geographic variation.

The first factor is determined by PC2 of front vowels. Dialects in Svealand (central
Sweden) and Finland have low scores on this factor, while dialects in the south and
north of Sweden have higher scores. Because so many vowels are involved in this
factor, it is unlikely that differences in the articulation of the vowels cause the
variation. One hypothesis is that the underlying factor has to do with voice quality.
It could be, for example, that there are dialectal differences with respect to creaky
voice, or that people in a certain geographic area speak more softly than elsewhere.
More research is needed to find out exactly what the differences in voice quality
are.

The second factor identified the diphthongization of long close vowels as a dialectal
feature. This is a characteristic property of South Swedish dialects. A number of
factors identify variables that are characteristic of the dialects on the island of
Gotland (especially diphthongs). Other dialect areas recognized by several factors
are the provinces Jämtland and Norrland in the north of Sweden.

The third factor showed that there is a large difference in the pronunciation of
certain vowels between older and young speakers in almost the whole language area:
young speakers pronounce mid front vowels (the vowels in lett, dör, lös, söt, lär and
leta) more open than older speakers.


Aggregation


In dialectometric research the starting point is not the analysis of separate
linguistic variables, but analysis of the whole (all variables together). For this
purpose, aggregate linguistic distances between dialects were measured. The aim of
dialectometric research is generally to make a dialect classification with the help
of computational methods. The emphasis is not on the linguistic features of each
dialect group, but on analyzing the relationships between language varieties when all
available data are aggregated. In Chapter 7 of this thesis a dialectometric analysis
of vowel pronunciation in Swedish is carried out. To calculate the linguistic
distance between varieties, the Euclidean distance function was used (formula 6.1,
p. 102).

Statistical methods that have often been used in dialectometric research are cluster
analysis and multidimensional scaling (MDS). For both techniques the starting point
is a distance matrix of the aggregate linguistic distances between all pairs of
dialects studied. With cluster analysis dialects are divided into groups, while MDS
describes the relationships between dialects as a continuum. In 2001 Tibshirani,
Walther and Hastie introduced the gap statistic for determining the number of
significant clusters in cluster analyses. In the Swedish vowel data of this thesis
the gap statistic could not identify any significant clusters, which means that the
Swedish dialects form a true continuum as far as vowel pronunciation is concerned.
For the analysis of the aggregate linguistic distances, MDS was therefore used in
this thesis.

A number of different MDS analyses were carried out. In the first analysis the
geographic variation in vowel pronunciation was analyzed on the basis of average PC
scores per site. The MDS analysis shows that, despite the dialect continuum without
sharp borders, coherent dialect areas can be found within the continuum. These areas
correspond to traditional classifications of the Swedish dialects.

In a next step the material was divided into two age groups per site. This analysis
shows that the linguistic distances between sites are much smaller for the young
speakers than for the older speakers. This indicates that leveling of the dialects is
taking place. In a number of central sites close to the largest cities, Stockholm and
Göteborg, the distance in vowel pronunciation between young and older speakers is
largest. In more peripheral parts of the language area, for example in the south, the
north and the Swedish-speaking area of Finland, the dialects appear to be more
stable. Despite the fact that the distances between dialects are becoming smaller,
roughly the same geographic dialect areas can be recognized in the vowel
pronunciation of young speakers as in that of older speakers. The dialect areas thus
persist even though the linguistic distances are becoming smaller.

In the last MDS analysis the speakers were divided into four groups per site
(younger women, younger men, older women and older men). According to this analysis
the differences in vowel pronunciation are much larger between the two age groups
than between men and women. In the older age group the difference between men and
women is slightly larger than in the younger age group.


Dialect leveling and language change


One of the methodological aims of this thesis was to compare the dialectometric
approach (aggregation) with the more traditional method in dialectological research,
namely the analysis of separate linguistic variables. Both analyses have, as described
above, shown large-scale leveling of the dialects. The analysis of the separate vowels
shows clearly which vowels are changing, while the aggregate analysis shows at which
sites the total language change is largest. A combination of the two methods can
therefore give more insight into the linguistic situation than either analysis on its
own.

On the basis of the results of both methods, conclusions can be drawn about the type
of language change that is currently under way in Swedish dialects. The change in
Swedish vowel pronunciation can be characterized as a combination of convergence to
the standard language and dialect contact. Some vowels show a clear convergence to
the standard language (e.g. the vowels in lär and dör). For other vowels the change
involves innovation (e.g. söt).

It appears that the allophonic variation in the Standard Swedish vowels <ä> and <ö>
is disappearing. In Standard Swedish the pronunciation of these two vowels before an
<r> has been more open than in other contexts. Many dialects, by contrast, had no
allophonic variation but had a closer pronunciation in all phonological contexts. In
the new pronunciation, used by many of the younger speakers in this study, only open
variants of the vowels <ä> and <ö> are used. While the opening of the vowels in lär
and dör means convergence to the standard language, a vowel system without allophonic
variants, which has been characteristic of many dialects but not of Standard Swedish,
is preserved.


Summary in Swedish


Vowel pronunciation has long been considered to show great variation within the
Swedish language area and to be characteristic of dialects as well as of regional
varieties of Standard Swedish. Nevertheless, there are few studies that cover the
whole language area and describe the variation in vowel pronunciation with
instrumental methods. The aim of this thesis was to describe the geographic variation
in vowel pronunciation within the Swedish language area. The research questions were:


• What is the geographic distribution of dialectal features in vowel pronunciation?
• Are there vowel features that co-vary?
• How large is the dialectal variation, and to what extent are features from the
local dialects preserved?
• In which areas are the dialects changing? Which dialects are stable?
• How large is the difference in vowel pronunciation between older and younger
speakers?
• Which vowels are changing, and in what direction?
• Is there gender-related variation in vowel pronunciation in Swedish dialects?
• How can Swedish dialects be classified on the basis of vowel pronunciation?
• Does a classification of modern varieties of Swedish agree with traditional dialect
classifications?
• Does a classification based on vowel pronunciation agree with dialect
classifications based on other linguistic levels?


The material for the study consists of vowel data from the SweDia2000 database¹.
The material was recorded in 1998-2001 at a total of 98 sites within the Swedish
language area. At each site recordings were made with both older and younger speakers
of both sexes. Most of the older informants were about 55-75 years old, while the
younger informants were about 20-35 years old. At most sites a total of twelve
informants were interviewed: three older women, three older men, three younger women
and three younger men. The total number of dialect speakers analyzed for this thesis
was 1,170. In addition, reference vowels from twelve informants representing Standard
Swedish were included.

¹ <http://swedia.ling.gu.se/>

A total of 19 vowels were included in the analysis. These were the stem vowels in the
words dis, disk, dör, dörr, flytta, lass, lat, leta, lett, lott, lus, lås/låt, lär,
lös, nät, sot, särk, söt and typ. This set includes all the long vowels of Standard
Swedish and most of the short vowels. The words were elicited with the help of
riddles, and each informant repeated the words three to five times. The vowels were
elicited with the same words throughout the language area. The only exception to this
rule is long å, which was elicited with the word lås at some sites and with låt at
others.

The vowel pronunciations were analyzed acoustically, and the results of the acoustic
analysis were analyzed both at the level of linguistic features and at the level of
language varieties.


Acoustic analysis


The method most used by dialectologists and sociolinguists for measuring vowel
quality acoustically is formant measurement. For this study, however, a different
method of acoustic analysis was chosen. The vowel spectra were filtered with Bark
filters up to 18 Bark, and this filter-bank representation was then reduced to two
articulatorily meaningful components by means of principal component analysis (PCA).
Bark filters correspond to the critical bandwidth of the basilar membrane in the human
inner ear, which means that a representation in Bark filters models human perception.
The method was introduced by Plomp, Pols and Van de Geer in 1967, and Jacobi showed
in 2009 that the method is suitable for analyzing geographic and social variation in
vowel pronunciation in large collections of dialect data.
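
To give an idea of what "up to 18 Bark" means in Hertz, the following sketch converts
between the two scales. It assumes Traunmüller's (1990) approximation of the Bark
scale without its end corrections; the summary does not state which Bark formula or
which filter widths were used, so the one-Bark bands in `band_edges_hz` are purely
illustrative.

```python
def hz_to_bark(freq_hz):
    """Traunmüller (1990) approximation of the Bark scale (no end corrections)."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53


def bark_to_hz(bark):
    """Inverse of hz_to_bark, obtained by solving the formula for the frequency."""
    return 1960.0 * (bark + 0.53) / (26.28 - bark)


def band_edges_hz(max_bark=18, width_bark=1.0):
    """Edges in Hz of adjacent bands of `width_bark` Bark from 0 up to max_bark.

    The band width is an assumption made for illustration only.
    """
    n_bands = int(round(max_bark / width_bark))
    return [bark_to_hz(i * width_bark) for i in range(n_bands + 1)]
```

Under this approximation 18 Bark corresponds to roughly 4.4 kHz, so the filter bank
covers the frequency region in which the first formants of vowels are found.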

One reason why PCA of Bark filters is well suited for analyzing large amounts of
vowel data is that this method can be automated more reliably than formant analysis.
Automated formant measurements always contain a certain proportion of measurement
errors that have to be corrected by hand. In contrast to formant measurements, Bark
filtering and PCA can be fully automated. Nevertheless, a good deal of manual work
was required for this analysis as well. All vowel material was segmented and
transcribed manually within the SweDia project and its follow-up project SweDat.

The chosen method is not entirely free of problems, however. Since information from
the whole vowel spectrum is used, the amount of background noise in the recordings
affects the measured values. All dialect recordings in the material were made under
relatively similar recording conditions (mostly in a quiet room in the informants'
homes), which means that the recordings do not differ much from each other with
respect to noise level. The informants representing Standard Swedish, on the other
hand, were recorded in a studio, and these recordings therefore have a considerably
higher signal-to-noise ratio than the dialect recordings, which turned out to affect
the PCA values.

A major problem for all acoustic measurement of vowel quality is the individual
variation caused by physiological/anatomical differences in the vocal apparatus. A
longer vocal tract, for example, gives lower formant frequencies and PCA values than a
shorter vocal tract. As listeners we adapt immediately to different speakers and
automatically normalize the systematic differences in the acoustic signal that exist
between speakers. But the question of how to normalize these differences in acoustic
measurements has not been fully solved. One question for the acoustic analysis in this
thesis was therefore to what extent the individual differences in the acoustic
measures can be reduced.

The average difference in vowel space between women and men is large, since men on
average have longer vocal tracts than women. Normalizing the systematic differences
in the acoustic measures between women and men was more important than normalizing
away all individual variation, because the number of female and male informants was
not constant across the sites in the study. By using means of a number of speakers per
dialect one can assume that differences due to physiology/anatomy are evened out to
some extent, but if the gender distribution within the groups is uneven, groups with
more women than men will consistently have higher means than groups with more men
than women. One aim of the dialectological analysis was also to investigate linguistic
differences between women and men, which is impossible if the acoustic measures are
affected by anatomical differences between the sexes. One solution turned out to be to
apply PCA separately to the data from female and male speakers. This led to a
significant reduction of the differences in the acoustic measures between women and
men compared with an analysis in which both sexes were included in one and the same
analysis. The total variation between speakers was also reduced to some extent by this
procedure.
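
Why separate analyses remove the average offset between the sexes can be seen in a
small sketch. It assumes that component scores are standardized to mean 0 and
standard deviation 1 within each analysis; the function names, the power iteration
and the data layout (one row of filter values per vowel token) are hypothetical
choices for a dependency-free example, not the procedure used in the thesis.

```python
import math
import random


def standardized_pc_scores(rows, n_components=2, n_iter=500, seed=0):
    """Return PC scores (mean 0, SD 1 per component) for a list of feature rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    columns = []
    for _ in range(n_components):
        v = [rng.random() + 0.1 for _ in range(d)]
        for _ in range(n_iter):  # power iteration towards the leading eigenvector
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        columns.append([sum(c[j] * v[j] for j in range(d)) / math.sqrt(eigval)
                        for c in centered])
        # Deflate so that the next pass finds the following component.
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [list(scores) for scores in zip(*columns)]


def scores_by_gender(data):
    """Run the analysis separately for each key of `data` (e.g. 'women', 'men')."""
    return {gender: standardized_pc_scores(rows) for gender, rows in data.items()}
```

Because each analysis centers its own data, the scores of women and of men both have
mean 0, so a constant acoustic offset between the two groups no longer appears in the
scores.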

For a smaller part of the material (three sites) formant measurements were also made
and correlated with the results of the principal component analysis. The correlation
between the first principal component (PC1) and the first formant (F1) was very high
(r = 0.88 for both sexes). The correlation between the second principal component
(PC2) and the second formant (F2) was somewhat lower (men: r = 0.73; women:
r = 0.74), which means that PC2 is influenced somewhat more than PC1 by information
in the spectrum other than formants. A multivariate analysis of variance showed that
F1 and F2 in Bark separate different vowels somewhat better than PC1 and PC2, while
the unwanted effect of gender was considerably larger in the formant values than in
the principal components.
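
The r values above are correlation coefficients between paired measurements of the
same vowel tokens. As a minimal sketch, assuming that Pearson's product-moment
correlation is meant (the summary only gives r), the computation looks as follows;
the function name and any data passed to it are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences of measurements."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Called with the PC1 scores and the F1 values of the same tokens, a result close to 1
means that the two measures rank the tokens in nearly the same way.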

PC1 is an approximate measure of vowel height, while PC2 mainly distinguishes between
front and back vowels. These relationships are illustrated clearly in, for example,
the graphs in Figure 5.8 on p. 66.

For the analysis of dialectal and social variation in vowel pronunciation, group
means of the PC values were computed. Three different groupings of the informants
were used in the analyses:


• one group per site (on average twelve informants per group)
• two groups per site: older and younger (on average six informants per group)
• four groups per site: older women, older men, younger women and younger men
(on average three informants per group)


Analysis at the level of linguistic features


Chapter 6 presents a number of analyses at the level of linguistic features. All
analyses in this chapter are based on a division of the informants into two groups per
site, i.e. older and younger speakers. Each group comprises on average three women
and three men. In connection with this chapter, maps were produced that show the PC
values per site and age group for each vowel. These maps are found in Appendix C and
can also be seen as a separate linguistic atlas of vowel pronunciation in Swedish
dialects.

A comparison of the average amount of variation per vowel showed that the long vowels
vary more than the short ones, both geographically and between the two age groups.
The two vowels showing the greatest geographic variation are the vowels in dör and
sot, while the least variation is found in the vowels in disk, lass and särk. Almost
all vowels vary more geographically in the older age group than in the younger. The
vowels for which the decrease in dialectal variation is largest are the vowels in lär
and lat. Only the vowel in söt shows considerably greater geographic variation among
younger than among older speakers.

To identify features that co-vary, a factor analysis was carried out. Some of the
factors identified distinct dialect groups in the material, while other factors
pointed to variation in the form of a continuum.

The first factor is related to PC2 of all front vowels in the material. The dialects
in Svealand and Finland have similar values on this factor, while the dialects in
Götaland and Norrland show a different pattern. Since so many vowels are involved, it
is unlikely that it is the articulation of the individual vowels that shows such a
systematic pattern of variation. An alternative hypothesis is that this factor has
captured differences in voice quality, which affects the spectrum of all vowels. The
analyses in this thesis cannot give a direct answer to the question, but dialectal
differences in voice quality have been noted by earlier researchers. Elert reported in
1983 that there are regional differences in Swedish in the use of, among other things,
creaky voice, nasality and breathy voice. This should be analyzed more closely with
instrumental methods in future studies.

The second factor identified the South Swedish diphthongization of long vowels, which
is strongest in close vowels. A number of factors showed features that distinguish the
Gotland dialects from other varieties of Swedish (mainly diphthongizations).
Distinctive features of the dialects in Jämtland and in Norrbotten were also
identified by a number of the factors.

The third factor pointed to a large difference between older and younger speakers in
virtually the whole language area. The younger speakers pronounce front mid vowels
(the vowels in lett, dör, lös, söt, lär and leta) considerably more openly than the
older speakers. This has been noted before: in Eskilstuna (Nordberg, 1975; Hammermo,
1989; Aniansson, 1996), in Stockholm (Kotsinas, 1994) and in and near Göteborg
(Andersson, 1994; Grönberg, 2004). The results in this thesis show for the first time
the wide geographic spread of this language change.


Analysis at the level of language varieties


In contrast to the analysis of individual linguistic features, the dialectometric
research tradition has focused on an overall analysis, in which aggregate linguistic
distances between dialects are computed on the basis of all available information.
These linguistic distances are then used to classify dialects automatically with the
help of computational methods. The main goal of such an analysis is not to describe
each dialect in terms of the features that are typical of it, but to describe how the
dialects relate to each other on the basis of the aggregate linguistic distances
between a number of dialects. Chapter 7 gives a dialectometric analysis of the vowel
material. Euclidean distance (formula 6.1, p. 102) was used as the measure for
computing the linguistic distance between varieties.
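
A minimal sketch of such an aggregate distance is given below. It assumes that each
variety is represented by a profile of mean PC values (for example PC1 and PC2 of
each vowel, in a fixed order) and uses the plain Euclidean distance; formula 6.1 is
not reproduced in this summary and may include a normalization that is omitted here.
The function names and the dictionary layout are hypothetical.

```python
import math


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two equally long profiles of mean PC values."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must contain the same variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def distance_matrix(profiles):
    """Pairwise distances between all varieties in a {name: profile} dictionary."""
    names = sorted(profiles)
    matrix = [[euclidean_distance(profiles[a], profiles[b]) for b in names]
              for a in names]
    return names, matrix
```

The resulting symmetric matrix, with zeros on the diagonal, is the kind of input that
both cluster analysis and MDS operate on.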

Common statistical methods in dialectometric research are cluster analysis and
multidimensional scaling (MDS). Both methods are based on a matrix of the aggregate
linguistic distances between all pairs of varieties. Cluster analysis divides the
dialects into groups, while MDS describes dialect continua. The gap statistic,
introduced by Tibshirani, Walther and Hastie in 2001, can be used to find the number
of statistically significant clusters in a cluster analysis. When this measure was
applied to the material for this thesis, it turned out that no significant clusters
can be identified, which means that the Swedish dialects form a genuine continuum
with respect to vowel pronunciation. MDS was therefore used for the further analysis.
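
The idea behind the gap statistic can be sketched as follows: the within-cluster
dispersion of the data is compared with the dispersion expected for reference data
without cluster structure. The sketch makes several assumptions that are not taken
from the thesis: it uses a basic k-means instead of the clustering method applied
there, draws reference data uniformly from the bounding box of the observations, and
all function names and default values are hypothetical.

```python
import math
import random


def within_dispersion(points, k, rng, n_iter=50):
    """Pooled within-cluster sum of squares W_k after a basic k-means run."""
    centers = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster happens to become empty.
        centers = [
            tuple(sum(xs) / len(members) for xs in zip(*members))
            if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return Gap(k) and its simulation error s(k) for k = 1..k_max."""
    rng = random.Random(seed)
    lows = [min(xs) for xs in zip(*points)]
    highs = [max(xs) for xs in zip(*points)]
    gaps, errors = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            reference = [tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                         for _ in points]
            ref_logs.append(math.log(within_dispersion(reference, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((x - mean_ref) ** 2 for x in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errors.append(sd * math.sqrt(1 + 1 / n_refs))
    return gaps, errors


def estimated_clusters(gaps, errors):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1); None if no k qualifies."""
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - errors[i + 1]:
            return i + 1
    return None
```

An estimate of one cluster means that the data are not grouped more tightly than
structureless reference data, which is the situation described above for the Swedish
vowel data.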

A number of different MDS analyses were carried out. In the first analysis geographic
variation was analyzed on the basis of site means of the PC values. This analysis
showed that even though the dialects form a continuum without sharp dialect borders,
some more coherent dialect areas can be identified. These largely agree with Elias
Wessén's classical division of the Swedish dialects (South Swedish, Göta, Svea,
Norrland, East Swedish and Gotland dialects).

In the following step the material was divided according to the two age groups per
site. The analysis showed that the dialectal differences between sites are
considerably smaller in the younger age group than in the older, which points to
large-scale dialect leveling. The dialects showing the greatest change in vowel
pronunciation are many of the Central Swedish dialects near Stockholm and Göteborg.
Dialects in more peripheral parts of the language area, such as Skåne, Gotland,
Finland, parts of Norrland and also the area around Lake Vänern, turned out to be
considerably more stable. Although the linguistic distances between the dialects are
shrinking, the larger dialect areas that can be distinguished are roughly the same for
younger as for older speakers. The geographic division thus persists even though the
dialectal differences are becoming smaller.

A division of the informants according to both age and gender led to an analysis with
four informant groups per site (older women, older men, younger women and younger
men). This analysis showed that the generational differences are considerably larger
than the gender differences in vowel pronunciation. For the older informants the
difference between men and women is larger than for the younger speakers.


Dialect leveling and language change


One of the methodological goals of this thesis was to compare the dialectometric
perspective, in which the dialects are viewed as a whole, with analysis at the level
of detail, which has been the more traditional method of analysis in dialectology.
Both the analysis at the level of linguistic features and the analysis at the level of
language varieties have, as described above, pointed to ongoing dialect leveling. The
feature-level analysis shows which vowels are undergoing the greatest change, while
the variety-level analysis reveals in which areas the dialects as a whole are changing
most. A combination of these two methods can thus lead to a better understanding of
the dialectal variation than either analysis on its own.

The variety-level analysis showed that the Central Swedish dialects are undergoing
the greatest change. The feature-level analysis showed that above all front mid vowels
display a large difference between older and younger speakers. For these vowels,
different tendencies of change can be distinguished. The vowels in lär and dör have
had a very open pronunciation in Uppland varieties including Standard Swedish, while
varieties further west in Sweden have had a relatively close pronunciation. The
material in this thesis shows that the open pronunciation is spreading, which means a
change in the direction of the standard language.

The vowel in nät, for its part, has had a closer pronunciation in the dialects of
Uppland than in Standard Swedish (Stockholms-e). This close pronunciation in Uppland
appears to be disappearing and being replaced by an open ä. At the same time an
opening tendency of this vowel can be noted in other parts of the language area as
well. The Standard Swedish pronunciation [ɛ:] is being replaced by an even more open
pronunciation that is closer to the one that formerly occurred only before r and
retroflex consonants: [æ:]. A similar opening tendency can be noted for ö (elicited
with lös and söt).

For both ä and ö, a more open pronunciation before r and retroflex consonants has
been used above all in Uppland and in the eastern parts of the language area, while in
the west these vowels have lacked allophony and have been pronounced close in all
phonological contexts. The analyses in this thesis show that the distance between the
two allophones of both ä and ö is decreasing in Uppland and also among speakers
representing Standard Swedish. Further west, and above all in the Central Swedish
area, where older speakers use a close pronunciation in all contexts, a change is
taking place so that the pronunciation becomes more open in all contexts among
younger speakers.

The opening of the vowels in lär and dör can be seen as convergence to Standard
Swedish, while the opening of the vowels in nät and söt is an innovation. The analyses
in this thesis show that the innovation is spreading to a large number of the Swedish
dialects. An open pronunciation of ä and ö in all contexts means that the allophony of
these vowels disappears. The allophonic variants of these vowels have long complicated
phonological descriptions of the Standard Swedish vowel system. A loss of the
allophony would thus mean a substantial simplification of the vowel system.

Nordberg (1975) described the opening of long ö in Eskilstuna as part of a chain
shift, in which long u was also affected and approached the original close ö sound in
quality. In addition to the allophonic variation in ä and ö, the u sound has also been
considered to complicate phonological descriptions of Swedish vowels, since three
levels of rounding have been required to distinguish u phonologically from either i
and y or from e and ö. The chain shift that Nordberg (1975) described, by contrast,
leads to a completely symmetrical vowel system with only two degrees of rounding. The
long u sound (elicited with lus) shows great variation in the material for this
thesis. An opening of the vowel in lus among younger speakers can be noted above all
in Svealand.

The described change in front mid vowels is taking place at the same time as the
Swedish language area is going through large-scale dialect leveling. The language
change can be seen as an interplay of convergence to the standard language and dialect
contact. While the opening of the vowels in lär and dör in many dialects can be seen
as a change in the direction of Standard Swedish, a phonological system without
allophony in ä and ö, which has been characteristic of many dialects, is preserved,
whereas the allophonic variation in Standard Swedish seems to be on its way to
disappearing.


Appendix A 


Speakers 


The table below shows the number of speakers per site and per speaker group ana- 
lyzed in the thesis. In total the data comprises 1,170 speakers at 98 sites. The 
sites are displayed on a map in Figure 4.1 (p. 51). For analyzing the data three 
different groupings were made: 1) one group per site, 2) two groups per site (older 
and younger speakers), and 3) four groups per site (om = older men, ow = older 
women, ym = younger men, yw = younger women). 

Footnotes indicate how many of the 19 vowels analyzed in the thesis were recorded by
each speaker group (no footnote = 19 vowels, ¹ = 18, ² = 17, ³ = 16, ⁴ = 15). When
the number of vowels recorded by a group was less than 15, the group was not included
in the analyses. That explains why the number of older men is 289 when the data is
divided into one or two groups per site, but only 288 when a division into four groups
per site is made. From Kramfors only one older male speaker was included in the
thesis. He did not record 15 of the vowels, which is why older men from Kramfors were
not included as a group when four groups per site were analyzed. However, the data
from the one older male speaker is included in the two other groupings (that is, one
group per site, and older and younger speakers per site).


site              one group per site     older          younger        four groups
                  total om  ow  ym  yw   total om  ow   total ym  yw   om  ow  ym  yw
Ankarsrum         13
Anundsjö
Arjeplog
Asby
Aspås
Bara
Bengtsfors        12
Berg
Bjurholm
Borgå
Bredsätra
Broby
Brändö
Burseryd
Burträsk
Böda
Dalby
Delsbo
Dragsfjärd        11
Fjällsjö
Floby
Fole
Frillesås
Frostviken        11
Frändefors        13
Färila
Grangärde         12
Gräsö
Gåsborn
Hammarö
Hamneda
Haraker
Houtskär
Husby
Indal
Jämshög
Järnboås          12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Järsnäs           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Kalix             13    4   3   3   3    7     4   3    6     3   3    4   3   3   3
Korsberga         13    4   3   3   3    7     4   3    6     3   3    4   3   3   3
Kramfors          10²   1   2   3   4    3³    1   2    7²    3   4    0   2   3²  4
Kyrkslätt         12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Kärna             12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Kårsta            12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Leksand           13    3   3   4   3    6     3   3    7     4   3    3   3   4   3
Lillhärdal        13    3   4   3   3    7     3   4    6     3   3    3   4   3   3
Länna             10    2   3   3   2    5     2   3    5     3   2    2   3   3   2
Löderup           11    3   3   3   2    6     3   3    5     3   2    3   3   3   2
Malung            12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Nederluleå        12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Nora (Tärnsjö)    11    2   3   3   3    5     2   3    6     3   3    2   3   3   3
Norra Rörum       13    4   3   3   3    7     4   3    6     3   3    4   3   3   3
Nysätra           11    3   2   3   3    5     3   2    6     3   3    3   2   3   3
Närpes            12¹   3   3   3   3    6¹    3   3    6¹    3   3    3¹  3¹  3¹  3¹
Ockelbo           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Orust             14    3   4   3   4    7     3   4    7     3   4    3   4   3   4
Ovanåker          12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Piteå             12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Ragunda           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Rimforsa          12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Sankt Anna        12¹   3   3   4   2    6¹    3   3    6¹    4   2    3¹  3¹  4¹  2¹
Saltvik           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Segerstad         13    3   4   3   3    7     3   4    6     3   3    3   4   3   3
Skee              12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Skinnskatteberg   14    3   4   3   4    7     3   4    7     3   4    3   4   3   4
Skog              12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Skuttunge         12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Snappertuna       10    3   3   3   1    6     3   3    4     3   1    3   3   3   1
Sorunda           12    2   3   3   4    5     2   3    7     3   4    2   3   3   4
Sproge            12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Stenberga         12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Stora Mellösa     13    4   3   3   3    7     4   3    6     3   3    4   3   3   3
Storsjö           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Strömsund         12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Särna             11    2   4   2   3    6     2   4    5     2   3    2   4   2¹  3
Södra Finnskoga   12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Tjällmo           11    2   3   3   3    5¹    2   3    6     3   3    2¹  3¹  3   3
Torhamn           13    3   4   3   3    7     3   4    6     3   3    3   4   3   3
Torp              10    3   2   3   2    5     3   2    5     3   2    3   2   3   2
Torsås            12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Torsö             12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Vemdalen          12    3   3   3   3    6     3   3    6     3   3    3   3¹  3¹  3
Viby              12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Vilhelmina        13    3   4   3   3    7     3   4    6     3   3    3   4   3   3
Villberga         12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Vindeln           8⁴    0   3   3   2    3⁴    0   3    5⁴    3   2    0   3⁴  3⁴  2⁴
Väckelsång        12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Västra Vingåker   12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Våxtorp           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Vörå              12²   3   3   3   3    6²    3   3    6²    3   3    3²  3²  3²  3²
Åre               12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Årstad-Heberg     13    4   3   3   3    7     4   3    6     3   3    4   3   3   3
Årsunda           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Össjö             14    4   3   3   4    7     4   3    7     3   4    4   3   3   4
Östad             12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Överkalix         10    2   2   3   3    4     2   2    6     3   3    2   2   3   3
Öxabäck           12    3   3   3   3    6     3   3    6     3   3    3   3   3   3
Total             1170  289 300 291 290  589   289 300  581   291 290  288 300 291 290


¹: 18 vowels
²: 17 vowels
³: 16 vowels
⁴: 15 vowels


Appendix B


Cartographic methods 


All maps in this thesis were created using the RuG/L04¹ software. Maps displaying
linguistic results are found in Chapters 6, 7 and 8, and in Appendix C. In these 
maps the data sites are marked with a black dot. The dots are surrounded by a 
round area, which has a color that represents the site linguistically. When two or 
more data sites are so close to each other that the circles would overlap, smaller 
polygon shaped areas were computed using Delaunay triangulation (Heeringa, 2004, 
161-162). 

The maps display linguistic continua and were created with the maprgb function 
in RuG/L04, which uses the RGB color model. The model uses the three basic 
colors red, green, and blue. By mixing these colors in different proportions a color 
spectrum is created. Figure B.1 shows the RGB color spectrum as a cube. In this 
three-dimensional space the amount of red represents the first dimension, the amount 
of green the second dimension and the amount of blue the third dimension. Because 
RGB is an additive color model, black is defined by 0% of all three components, 
while white is obtained by adding the full amount of all three colors. By mixing two 
primary colors the secondary colors cyan, magenta and yellow are formed. 

Maps created with the RGB color model can represent one, two or three linguistic 
variables simultaneously. When three variables are displayed the full RGB color 
spectrum is used to create the maps. This technique was used for displaying the 
results of multidimensional scaling in Chapter 7. In § B.1 below the technique is 
described in more detail. 

In the maps in Appendix C, the two principal components (PCs) of the acoustic
analysis of the 19 vowels are displayed. This is done by using a two-dimensional slice
of the RGB color spectrum described in § B.2. The same two-dimensional model
was used to visualize the first two dimensions of the MDS analysis in § 7.2.1.


¹ RuG/L04: software for dialectometrics and cartography. By P. Kleiweg, University of
Groningen. <http://www.let.rug.nl/kleiweg/L04/>

The simplest maps display only one linguistic variable. § B.3 describes how this
one-dimensional color spectrum was created. The technique was used to display, for
example, factor scores in § 6.3 and the values of each dimension of multidimensional
scaling in Chapter 7.


B.1 Three-dimensional maps 


For displaying the results of multidimensional scaling (MDS) in Chapter 7 the full
RGB color spectrum was used. Figure B.1 shows a simplified RGB color spectrum as
a cube. The cube shows only seven steps in each dimension. In reality each dimension
can take values between 0 and 255, resulting in a much smoother spectrum.

MDS assigns positions in a three-dimensional space to all varieties included in 
the analysis (see § 7.1). By using the RGB color model all positions in the three- 
dimensional MDS space can be translated to a distinct color. The amount of red 
represents the first dimension of the MDS, the amount of green the second dimension 
and the amount of blue the third dimension. Coloring the area of each variety on a 
map with the color corresponding to the position assigned by MDS links the results 
of MDS to geography. 
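
The translation from MDS positions to colors can be sketched as follows. The sketch
assumes that each dimension is rescaled linearly between its minimum and maximum
over all varieties; the actual scaling applied by the maprgb function is not
described here, and the function name `mds_to_rgb` is hypothetical.

```python
def mds_to_rgb(coords):
    """Map 3-D MDS coordinates to (R, G, B) tuples with values from 0 to 255.

    Each dimension is rescaled separately between its minimum and maximum.
    A dimension without any variation is set to mid-level (128).
    """
    dims = list(zip(*coords))
    lows = [min(d) for d in dims]
    spans = [max(d) - min(d) for d in dims]
    colors = []
    for point in coords:
        colors.append(tuple(
            round(255 * (value - low) / span) if span else 128
            for value, low, span in zip(point, lows, spans)
        ))
    return colors
```

A variety with the lowest value in all three dimensions is drawn in black, and one
with the highest value in all three dimensions in white.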

Dialects that are found at the outer ends of the linguistic continuum are represented
by the colors in the corners of the cube in Figure B.1, where all three color
components are added to either the minimum or the maximum extent:


• red (R: 100%, G: 0%, B: 0%)
• green (R: 0%, G: 100%, B: 0%)
• blue (R: 0%, G: 0%, B: 100%)
• cyan (R: 0%, G: 100%, B: 100%)
• magenta (R: 100%, G: 0%, B: 100%)
• yellow (R: 100%, G: 100%, B: 0%)
• black (R: 0%, G: 0%, B: 0%)
• white (R: 100%, G: 100%, B: 100%).


In the center of the cube, where all three components are added to around 50%,
grayish colors are found. Dialects with average values in all three dimensions of the
MDS are represented by the colors in the center of the color spectrum.


B.2 Two-dimensional maps


For representing two linguistic variables a two-dimensional color spectrum is needed. 
This can be created by letting two of the primary RGB colors represent one variable 
and the third primary color represent the second variable. Figure B.2 shows the 
resulting color spectrum. Red and green are added in the same proportion, 
representing the first variable, while blue represents the second variable. The origin is 
placed in the upper right corner of the figure. Comparing the two-dimensional color 
spectrum in Figure B.2 to the full color spectrum in Figure B.1, it can be seen that 
the two-dimensional spectrum uses a slice from the full color spectrum reaching from 
the lower left corners to the upper right corners. Two-dimensional maps are used 
for visualizing the scores of the acoustic analysis of the vowels in Appendix C, and 
for visualizing the first two dimensions of the MDS analysis in § 7.2.1. 


B.2.1 Displaying two dimensions of MDS 


When the two-dimensional color spectrum is used for visualizing the first two 
dimensions of MDS (Figure 7.2, p. 134), low values in both dimensions lead to black 
and high values in both dimensions to white. High values in the first dimension 
and low values in the second result in yellow colors on the map, while low values in 
the first dimension and high values in the second lead to blue. 


B.2.2 Displaying acoustic PCs 


In the maps in Appendix C, the two principal components (PC1 and PC2) of the 
acoustic analysis of the vowels (see Chapter 5) are displayed. For calculating the 
scores of the principal component analysis, the regression method, which produces 
scores with a mean of 0 and a standard deviation of 1 for each component, was used. 
The color spectrum was scaled so that scores < -1.5 were assigned 0% color, while scores 
> 1.5 were assigned 100% color. As Figure 5.8 (p. 66) shows, the range -1.5 to +1.5 roughly 
covers an area including the one standard deviation ellipses of each of the Swedish 
point vowels. Using this method for assigning colors has the following implications: 


• PC1 < -1.5 ∧ PC2 < -1.5 ⇒ black 
• PC1 > 1.5 ∧ PC2 < -1.5 ⇒ yellow 
• PC1 < -1.5 ∧ PC2 > 1.5 ⇒ blue 
• PC1 > 1.5 ∧ PC2 > 1.5 ⇒ white 


Vowels with scores between these corner points are assigned colors in the spectrum 
between these extremes as shown in Figure B.2. 
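The color assignment for the PC scores can be illustrated with a short sketch. The function names `score_to_fraction` and `pc_color` are hypothetical, and the sketch assumes linear interpolation between the limits -1.5 and +1.5. The text only states that intermediate scores receive intermediate colors, so the linear form is an assumption.

```python
def score_to_fraction(score: float, limit: float = 1.5) -> float:
    """Clamp a PC score to [-limit, +limit] and rescale it to 0.0..1.0.

    Assumption: the rescaling between the two limits is linear.
    """
    clamped = max(-limit, min(limit, score))
    return (clamped + limit) / (2 * limit)


def pc_color(pc1: float, pc2: float) -> tuple:
    """Return (red, green, blue) as fractions between 0.0 and 1.0.

    Red and green both carry PC1 and blue carries PC2, as in the
    two-dimensional spectrum of Figure B.2.
    """
    first = score_to_fraction(pc1)
    second = score_to_fraction(pc2)
    return (first, first, second)
```

Under these assumptions a vowel with PC1 = 2.0 and PC2 = -2.0 gets 100% red and green and 0% blue, that is, yellow, while a vowel with scores of 0 on both components gets 50% of all three colors, that is, gray.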

As vowels plotted in the PC plane show a correspondence to the IPA vowel 
quadrilateral (see Figure 5.8, p. 66), the extreme colors in the spectrum can roughly 
be interpreted as the point vowels [i] (blue), [æ] (white), [a]/[ɑ] (yellow) and [u] 
(black). PC1 correlates highly with F1 (see § 5.2), which means that it roughly 
corresponds to vowel height. Accordingly, light and yellowish colors indicate open 
vowels, while blue and dark colors indicate close vowels. PC2 corresponds roughly to 
F2 (and, hence, to vowel backness), but the relationship between PC2 and F2 is not 
as strong as between PC1 and F1 (see § 5.2, Figure 5.13). In addition to F2, PC2 is 
also influenced by higher frequency areas in the spectrum (see § 5.2.3). Bluish and 
light colors indicate higher PC2 values than dark and yellow colors, which is related 
to vowel backness, but also to the intensity level in frequency areas above F2. 

The maps displaying the PC1 and PC2 values give an overview of the dia- 
lectal variation in the spectral properties of the vowels. The maps can roughly 
be interpreted in articulatory terms. However, the PCs represent vowels in a two- 
dimensional space, which means that, for example, vowel roundness is not represen- 
ted in a simple way in the PC plane. 


B.3 One-dimensional maps 


A model that displays a linguistic continuum based on only one variable was needed 
for displaying the results of factor analysis in § 6.3 and for displaying the values of 
each dimension of MDS separately in Chapter 7. In these maps, a scale between 
the primary color green (0% red and blue, 100% green) and its complementary color 
magenta (100% red and blue, 0% green) was used. 

Figure B.3 shows the color spectrum. This one-dimensional color spectrum is 
equal to the diagonal from the left upper front corner to the right lower back corner 
of the full RGB color spectrum in Figure B.1. The amount of red and blue is 
proportional to the value of the variable displayed, while the amount of green is 
inversely proportional to the value of the variable displayed. Adding 50% of all 
three colors gives gray. Hence, greenish colors indicate values below the average 
value of the variable displayed, while magenta-hued colors mean values above the 
average. 
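The green-magenta scale can be sketched in the same way. The function name `green_magenta` and the parameters `low` and `high` are hypothetical. The sketch assumes that the variable is rescaled linearly between a chosen lower and upper bound, and that the midpoint of these bounds corresponds to the average value of the variable, so that the average maps to gray.

```python
def green_magenta(value: float, low: float, high: float) -> tuple:
    """Map a value in [low, high] to (red, green, blue) fractions.

    Red and blue grow with the value and green shrinks with it, so `low`
    gives pure green and `high` gives pure magenta. Values outside the
    bounds are clamped (an assumption, not stated in the thesis).
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    frac = max(0.0, min(1.0, (value - low) / (high - low)))
    return (frac, 1.0 - frac, frac)
```

For example, with bounds 0 and 10 the value 5 yields 50% of all three colors, which is the gray midpoint of the continuum.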
Figure B.1. Three-dimensional RGB color spectrum. [Axis label recoverable from the figure: red.] 


Figure B.2. Two-dimensional color spectrum used for displaying PC1 and PC2 
values of the vowels. [Axis labels: PC1 and PC2; corner labels: [i], [u], [æ], [a].] 


Figure B.3. Green-magenta color continuum. 


Appendix C 


Vowel maps 


This appendix includes maps that visualize the pronunciation of 19 different vowels 
of older and younger speakers at 98 sites in Sweden and the Swedish-language parts 
of Finland. The data is described in Chapter 4 in the thesis and the acoustic analysis 
of the vowel segments in Chapter 5. 

As acoustic measures of vowel quality, two principal components (PCs) were 
extracted from Bark-filtered vowel spectra. The maps display the values of the two 
PCs using a technique described in Appendix B (§ B.2). The data from each site was 
divided into older and younger speakers (approximately six speakers per age group 
per site), and the maps display the average PC scores in the two speaker groups 
at each site. Values measured close to onset (at 25% of the vowel duration) and 
close to offset (75%) of the vowels are displayed separately. In each map, a Standard 
Swedish reference point is included in the upper left corner (rotated square). The 
Standard Swedish vowels were recorded by six older and six younger speakers. 
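The averaging behind each map point can be sketched as follows. The record layout (site, age group, time point, PC1, PC2) and the function name `average_scores` are hypothetical and serve only to illustrate the grouping described above. The sketch does not reproduce the actual data format of the SweDia database.

```python
from collections import defaultdict
from statistics import mean


def average_scores(records):
    """Average PC1 and PC2 per (site, age group, time point).

    `records` is an iterable of tuples
    (site, age_group, time_point, pc1, pc2), where time_point is
    "onset" (25% of the vowel duration) or "offset" (75%).
    """
    groups = defaultdict(list)
    for site, age_group, time_point, pc1, pc2 in records:
        groups[(site, age_group, time_point)].append((pc1, pc2))

    # One (mean PC1, mean PC2) pair per group, i.e. per colored map point.
    return {
        key: (mean(p[0] for p in values), mean(p[1] for p in values))
        for key, values in groups.items()
    }
```

Each resulting pair of averages corresponds to one colored point in one of the four panels (onset or offset, older or younger speakers) of a vowel map.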

The maps are organized so that corresponding long and short Standard Swedish 
vowels (that is, long and short vowels written with the same orthographic 
symbol) are placed next to each other when both vowels are present in the data set. Front 
vowels are presented first, starting with the close front vowels and going on with 
more open front vowels. After the front vowels the back vowels are presented in the 
reversed order, that is, starting with the most open back vowel and ending with the 
closest one. A description and interpretation of all maps is given in § 6.1 in the 
thesis. 

[Each figure consists of four map panels: onset old, onset young, offset old, 
offset young.] 


Figure C.1. The vowel in the word dis. Standard Swedish /i:/. Described in § 6.1.1 
(p. 88). 


Figure C.2. The vowel in the word disk. Standard Swedish /ɪ/. Described in 
§ 6.1.2 (p. 89). 


Figure C.3. The vowel in the word typ. Standard Swedish /y:/. Described in 
§ 6.1.3 (p. 90). 


Figure C.4. The vowel in the word flytta. Standard Swedish /y/. Described in 
§ 6.1.4 (p. 90). 


Figure C.5. The vowel in the word leta. Standard Swedish /e:/. Described in 
§ 6.1.5 (p. 91). 


Figure C.6. The vowel in the word lett. Standard Swedish /e/. Described in § 6.1.6 
(p. 92). 


Figure C.7. The vowel in the word lus. Standard Swedish /ʉ:/. Described in 
§ 6.1.7 (p. 92). 


Figure C.8. The vowel in the word nät. Standard Swedish /ɛ:/. Described in 
§ 6.1.8 (p. 93). 


Figure C.9. The vowel in the word lär. The open allophone [æ:] of /ɛ:/ in Standard 
Swedish. Described in § 6.1.9 (p. 94). 


Figure C.10. The vowel in the word särk. The open allophone [æ] of /ɛ/ in 
Standard Swedish. Described in § 6.1.10 (p. 96). 


Figure C.11. The vowel in the word söt. Standard Swedish /ø:/. Described in 
§ 6.1.11 (p. 96). 


Figure C.12. The vowel in the word lös. Standard Swedish /ø:/. Described in 
§ 6.1.12 (p. 97). 


Figure C.13. The vowel in the word dör. The open allophone [œ:] of /ø:/ in 
Standard Swedish. Described in § 6.1.13 (p. 98). 


Figure C.14. The vowel in the word dörr. The open allophone [œ] of /ø/ in 
Standard Swedish. Described in § 6.1.14 (p. 99). 


Figure C.15. The vowel in the word lat. Standard Swedish /a:/. Described in 
§ 6.1.15 (p. 100). 


Figure C.16. The vowel in the word lass. Standard Swedish /a/. Described in 
§ 6.1.16 (p. 100). 


Figure C.17. The vowel in the words lås and dt. Standard Swedish /o:/. Described 
in § 6.1.17 (p. 100). 


Figure C.18. The vowel in the word lott. Standard Swedish /o/. Described in 
§ 6.1.18 (p. 101). 


Figure C.19. The vowel in the word sot. Standard Swedish /u:/. Described in 
§ 6.1.19 (p. 101). 